The Bell-Kochen-Specker theorem establishes the impossibility of a noncontextual hidden variable
model of quantum theory, or equivalently, that quantum theory is contextual. In this paper,
an operational definition of contextuality is introduced which generalizes the standard notion in 
three ways: (1) it applies to arbitrary operational theories rather than just quantum theory, (2) 
it applies to arbitrary experimental procedures rather than just sharp measurements, and (3) it 
applies to a broad class of ontological models of quantum theory rather than just deterministic 
hidden variable models. We derive three no-go theorems for ontological models, each based on an 
assumption of noncontextuality for a different sort of experimental procedure; one for preparation 
procedures, another for unsharp measurement procedures (that is, measurement procedures associ- 
ated with positive-operator valued measures), and a third for transformation procedures. All three 
proofs apply to two-dimensional Hilbert spaces, and are therefore stronger than traditional proofs 
of contextuality. 
I. INTRODUCTION

Traditionally, a noncontextual hidden variable model 
of quantum theory is one wherein the measurement out- 
come that occurs for a particular set of values of the 
hidden variables depends only on the Hermitian opera- 
tor associated with the measurement and not on which 
Hermitian operators are measured simultaneously with 
it. For instance, suppose A,B and C are Hermitian op- 
erators such that A and B commute, A and C commute, 
but B and C do not commute. Then the assumption of 
noncontextuality is that the value predicted to occur in 
a measurement of A does not depend on whether B or C 
was measured simultaneously. The Bell-Kochen-Specker 
theorem shows that a hidden variable model of quantum 
theory that is noncontextual in this sense is impossible 
for Hilbert spaces of dimension three or greater [?, 3].

The traditional definition of noncontextuality is lacking in several respects:
(1) it does not apply to an arbitrary physical theory, but is rather specific to
quantum theory; (2) it does not apply to unsharp measurements, that is, those
associated with positive-operator valued measures (POVMs), nor does it apply to
preparation or transformation procedures; and (3) it does not apply to
ontological models wherein the outcomes of measurements are determined only
probabilistically from the complete physical state of the system under
investigation, for instance, indeterministic hidden variable models or
ontological models of quantum theory lacking hidden variables. In this paper,
we propose a new definition:

A noncontextual ontological model of an operational theory is one
wherein if two experimental procedures are operationally equivalent,
then they have equivalent representations in the ontological model.



This definition will be explained in section II of this article, where we
provide a precise account of what it is for two experimental procedures to be
operationally equivalent, and describe what is meant by an ontological model of
an operational theory, specifying in particular how different experimental
procedures (preparations, measurements and transformations) are represented in
such a model. We also explain why it is appropriate to call this sort of
ontological model noncontextual by providing an operational definition of an
experimental context.
In section III we specialize our definition to the case
of quantum theory. We provide examples of the sorts of 
contexts that can arise for preparations, transformations 
and measurements, and describe what an assumption of 
noncontextuality for each type of procedure implies for 
an ontological model of quantum theory. In the case of 
measurements, we also generalize the object that is ex- 
amined for context-dependence from outcomes to proba- 
bilities of outcomes, and discuss the motivation for doing 
so. Further, we show how the traditional notion of non- 
contextuality is subsumed as a special case of our gener- 
alized notion when the outcomes of sharp measurements 
are assumed to be uniquely determined by the complete 
physical state of the system under investigation. 

In sections IV, V and VI we provide no-go theorems for ontological models based
on the assumption of noncontextuality for preparations, unsharp measurements,
and transformations, respectively. All three proofs apply to two-dimensional
(2d) Hilbert spaces, and are therefore stronger than traditional no-go theorems
for noncontextuality, which require Hilbert spaces of dimension three or
greater¹. In section V we also provide a no-go theorem for noncontextuality of
unsharp measurements based on a recent generalization of Gleason's theorem to
2d Hilbert spaces [?, ?]. Section VII provides a general discussion of the
motivation and plausibility of noncontextuality for different sorts of
procedures, and section VIII investigates the connection between these
different sorts of noncontextuality and the assumption that measurement
outcomes are uniquely determined by the complete physical state of the system
under investigation. Some conclusions and questions for future research are
presented in section IX.



II. DEFINITIONS OF NONCONTEXTUALITY
FOR ANY OPERATIONAL THEORY



In an operational interpretation of a physical theory 
the primitive elements are preparation procedures, trans- 
formation procedures, and measurement procedures. 
These are understood as lists of instructions to be im- 
plemented in the laboratory. The role of an operational 
theory is merely to specify the probabilities p(k|P, T, M)
of different outcomes k that may result from a measurement
procedure M given a particular preparation procedure P,
and a particular transformation procedure T.
When there is no transformation procedure, or when it
is considered to be part of the preparation or the
measurement, we have simply p(k|P, M).

Given the rule for determining probabilities of out- 
comes, one can define a notion of equivalence among 
experimental procedures. Specifically, two preparation 
procedures are deemed equivalent if they yield the same 
long-run statistics for every possible measurement proce- 
dure, that is, P is equivalent to P' if 



$$p(k|P, M) = p(k|P', M) \quad \text{for all } M. \tag{1}$$



Two measurement procedures are deemed equivalent if 
they yield the same long-run statistics for every possible 
preparation procedure, that is, M is equivalent to M' if 
their outcomes can be associated one-to-one such that 



$$p(k|P, M) = p(k|P, M') \quad \text{for all } P. \tag{2}$$



Finally, two transformation procedures are deemed 
equivalent if they yield the same long-run statistics for 
every possible preparation procedure that may precede 
and every possible measurement procedure that may fol- 
low, that is, T is equivalent to T' if 



$$p(k|P, T, M) = p(k|P, T', M) \quad \text{for all } P, M. \tag{3}$$



¹ Recent work by Cabello [?] generalizes the notion of contextuality to unsharp
measurements in a manner that is different from the proposal of this paper.
From our perspective, this work makes use of an assumption of deterministic
outcomes for unsharp measurements that cannot be justified by an assumption of
noncontextuality. This issue is discussed in section VIII.



It follows that one can distinguish two types of features 
of an experimental procedure: the first type of feature is 
one that is specified by specifying the equivalence class 
that the procedure falls in, while the second type is one 
that is not. The set of features of the second type (those
that are not specified by specifying the equivalence class)
we call the context of the experimental procedure. Note
that, by our definition of an experimental context, having
knowledge of the context does not enable one to predict
the outcome of an experiment any better than if one only
knew the equivalence class of the experimental procedure.

An example from quantum theory should clarify the notion of a context. Consider
the following different measurement procedures for photon polarization. The
first, which we denote by M1, consists of a piece of polaroid oriented to pass
light that is vertically polarized along the z axis, followed by a
photodetector. The second, which we denote by M2, consists of a birefringent
crystal oriented to separate light that is vertically polarized along the z
axis from light that is horizontally polarized along this axis, followed by a
photodetector in the vertically polarized output. The third and fourth
procedures, denoted M3 and M4, are identical to M1 and M2 respectively, except
that they are defined relative to an axis n that is skew to the z axis. It
turns out that the statistics of outcomes for M1 are the same as those for M2,
for all preparation procedures, and those for M3 are the same as those for M4.
However, the statistics of outcomes for the first pair are different from those
of the second. Thus, M1 and M2 fall in one equivalence class of measurements,
and M3 and M4 fall in another. The orientation of the polaroid or calcite
crystal is an example of the first sort of feature of an experimental
operation, one whose variation involves a variation in the operational
equivalence class of the procedure. On the other hand, whether one uses a piece
of polaroid or a birefringent crystal to measure photon polarization is a
feature of the measurement procedure of the second type; a variation of this
feature does not change the equivalence class of the procedure. It is therefore
part of the context of the measurement procedure.
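
The grouping of procedures into equivalence classes can be illustrated numerically. The following is a minimal sketch under stated assumptions: the analyzers are idealized and obey Malus's law, a finite family of linearly polarized test preparations stands in for "all preparation procedures", and the names `pass_probability`, `statistics` and `equivalence_classes` are hypothetical. The device label (polaroid or birefringent crystal) never enters the statistics, so it plays the role of the context.

```python
import math

def pass_probability(prep_angle, axis_angle):
    """Malus's law: probability that a photon linearly polarized at
    prep_angle passes an ideal analyzer oriented at axis_angle."""
    return math.cos(prep_angle - axis_angle) ** 2

# Hypothetical procedures: (axis angle, device). The device is context.
procedures = {
    "M1": (0.0, "polaroid"),
    "M2": (0.0, "birefringent crystal"),
    "M3": (math.pi / 6, "polaroid"),
    "M4": (math.pi / 6, "birefringent crystal"),
}

# A finite family of test preparations stands in for "all P".
test_preparations = [i * math.pi / 12 for i in range(12)]

def statistics(name):
    """Outcome statistics of a procedure over all test preparations."""
    axis, _device = procedures[name]
    return tuple(round(pass_probability(p, axis), 9) for p in test_preparations)

def equivalence_classes(names):
    """Group procedures whose statistics agree for every test preparation."""
    classes = {}
    for name in names:
        classes.setdefault(statistics(name), []).append(name)
    return sorted(classes.values())

print(equivalence_classes(procedures))  # [['M1', 'M2'], ['M3', 'M4']]
```

Changing the axis angle moves a procedure to a different class, while changing only the device label does not.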

To properly define a noncontextual ontological model 
of an operational theory, it is not enough to have a defi- 
nition of context; we also need to specify precisely what 
we mean by an ontological model. We turn to this now. 

An ontological model is an attempt to offer an explanation of the success of an
operational theory by assuming that there exist physical systems that are the
subject of the experiment. These systems are presumed to have attributes
regardless of whether they are being subjected to experimental test, and
regardless of what anyone knows about them. These attributes describe the real
state of affairs of the system. Thus, a specification of which instance of each
attribute applies at a given time we call the ontic state of the system. If the
ontic state is not completely specified after specifying the preparation
procedure, then the additional variables required to specify it are called
hidden variables. Although most ontological models do involve hidden variables,
this is not always the case. For instance, the ontic states may be associated
one-to-one with the equivalence classes of preparation procedures (as is the
case for pure preparation procedures in the Beltrametti-Bugajski model of
quantum theory [?]). We shall denote the complete set of variables in an
ontological model by $\lambda$, and the space of values of $\lambda$ by
$\Omega$.

Within an ontological model of an operational theory, preparation procedures
are preparations of the ontic state of the system. However, the procedure need
not fix this state uniquely; rather, it might only fix the probabilities that
the system be in different ontic states. Thus, someone who knows that a system
was prepared using the preparation procedure P describes the system by a
probability density $\mu_P(\lambda)$ over the model variables.

Similarly, measurement procedures are measurements of the ontic state of the
system. Again, these procedures need not enable one to infer the identity of
the ontic state uniquely, nor need they even enable one to infer a set of ontic
states within which the actual ontic state lies. Rather, they might only enable
one to infer probabilities for the system to have been in different ontic
states. In this, the most general case, the outcome of the measurement is not
uniquely determined by $\lambda$. Only the probabilities of the different
outcomes are so determined. Thus, for every value of $\lambda$, one associates
a probability $\xi_{M,k}(\lambda)$ which is the probability of obtaining
outcome k in a measurement M given that the system is in the ontic state
$\lambda$. We call $\xi_{M,k}(\lambda)$ an "indicator function"².

Finally, transformation procedures are transformations of the ontic state of
the system. These may be stochastic transitions. Thus, a transformation
procedure T is represented by a transition matrix, $\Gamma_T(\lambda',
\lambda)$, which represents the probability density for a transition from the
ontic state $\lambda$ to the ontic state $\lambda'$.

Thus, within an ontological model, if the preparation procedure is P, and the
measurement procedure is M, then the probability assigned to outcome k is the
probability assigned to outcome k given $\lambda$, averaged over all
$\lambda$, weighted by the probability of $\lambda$, that is,
$\int d\lambda\, \mu_P(\lambda)\, \xi_{M,k}(\lambda)$. If there is a
transformation procedure T intervening between the preparation and measurement,
and this is associated with the transition matrix $\Gamma_T(\lambda',
\lambda)$, then the probability of outcome k is
$\int d\lambda'\, d\lambda\; \xi_{M,k}(\lambda')\, \Gamma_T(\lambda', \lambda)\, \mu_P(\lambda)$.

² Some might argue that within the framework of an ontological model, the term
measurement ought to be reserved for a procedure which reveals some attribute
of the system under investigation. However, we feel that it is suitable for any
procedure that leads to an update in one's information about which instance of
some attribute applied, or equivalently, in one's information about what the
ontic state of the system was prior to the procedure being implemented. Note
that even this weak notion of what constitutes a measurement fails for some
experimental procedures that lead to distinct outcomes, namely, those wherein
the probabilities of the different outcomes are independent of the ontic state.
For simplicity, however, we shall not introduce any novel terminology for this
exceptional case.

Summarizing, an ontological model assumes: (1) every preparation procedure P is
associated with a normalized probability density over the ontic state space,
$\mu_P : \Omega \to [0,1]$ such that $\int \mu_P(\lambda)\, d\lambda = 1$;
(2) every measurement procedure M with outcomes labelled by k is associated
with a set of indicator functions $\{\xi_{M,k}(\lambda)\}_k$ over the ontic
states, that is, a set of functions $\xi_{M,k} : \Omega \to [0,1]$ satisfying
$\sum_k \xi_{M,k}(\lambda) = 1$ for all $\lambda$; (3) every transformation
procedure T is associated with a transition matrix
$\Gamma_T : \Omega \times \Omega \to [0,1]$ such that
$\int \Gamma_T(\lambda', \lambda)\, d\lambda' = 1$ for all $\lambda$, and
(4) the predictions of the operational theory are reproduced exactly by the
model, that is,

$$p(k|P, T, M) = \int d\lambda'\, d\lambda\; \xi_{M,k}(\lambda')\, \Gamma_T(\lambda', \lambda)\, \mu_P(\lambda) \tag{4}$$

for all P, T, and M.
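
Conditions (1) to (4) can be made concrete on a finite ontic state space. The following is a minimal sketch, assuming a two-point ontic space so that the integrals in Eq. (4) become sums; the function names and all numerical values are illustrative assumptions, not taken from the text.

```python
def outcome_probability(mu, gamma, xi_k):
    """Discrete analogue of Eq. (4):
    p(k|P,T,M) = sum over l', l of xi_k[l'] * gamma[l'][l] * mu[l]."""
    n = len(mu)
    return sum(xi_k[lp] * gamma[lp][l] * mu[l]
               for lp in range(n) for l in range(n))

def is_valid_model(mu, gamma, xi, tol=1e-12):
    """Check normalization conditions (1)-(3) on a finite ontic space."""
    n = len(mu)
    prep_ok = abs(sum(mu) - 1) < tol and all(0 <= m <= 1 for m in mu)
    meas_ok = all(abs(sum(xi_k[l] for xi_k in xi) - 1) < tol for l in range(n))
    trans_ok = all(abs(sum(gamma[lp][l] for lp in range(n)) - 1) < tol
                   for l in range(n))
    return prep_ok and meas_ok and trans_ok

# Toy model with two ontic states (all numbers are illustrative).
mu = [0.75, 0.25]        # mu[l] = mu_P(l)
gamma = [[0.9, 0.2],     # gamma[l'][l] = Gamma_T(l', l); columns sum to 1
         [0.1, 0.8]]
xi = [[1.0, 0.5],        # xi[k][l] = xi_{M,k}(l); sums to 1 over k
      [0.0, 0.5]]

assert is_valid_model(mu, gamma, xi)
probs = [outcome_probability(mu, gamma, xi_k) for xi_k in xi]
print(probs)
```

Because each condition is a normalization, the resulting outcome probabilities sum to one for any valid choice of `mu`, `gamma` and `xi`.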

In general, the representation of an experimental procedure in an ontological
model might depend on both its equivalence class and its context.³ It is
natural, however, to consider the possibility of a model wherein the
representation of every experimental procedure depends only on its equivalence
class and not on its context. After all, a natural way to explain the fact that
a pair of preparation (measurement, transformation) procedures are
operationally equivalent is to assume that they prepare (measure, transform)
the ontic state of the system in precisely the same way. We shall call such an
ontological model noncontextual. Any operational theory that admits such a
model shall also be called noncontextual⁴. In general, if any set of procedures
is represented in a context-independent way within an ontological model, we
shall say that the model is noncontextual for those procedures.

It is useful to explicitly characterize the assumption of 
noncontextuality for preparations, transformations, and 
measurements. We will call an ontological model prepa- 
ration noncontextual if the representation of every prepa- 
ration procedure is independent of context, that is, if 



$$\mu_P(\lambda) = \mu_{e(P)}(\lambda) \tag{5}$$



where e(P) is P's equivalence class. Similarly, we will 
call a model measurement noncontextual if the represen- 
tation of every measurement procedure is independent of 
context, that is, if 



$$\xi_{M,k}(\lambda) = \xi_{e(M),k}(\lambda) \tag{6}$$



³ If one allows such generality, then (perhaps contrary to a common impression)
it is possible to provide an ontological model of quantum theory. The
deBroglie-Bohm theory [?] is an example.

⁴ This terminology allows one to use the phrase "quantum theory is contextual"
as a shorthand for "quantum theory does not admit a noncontextual ontological
model", much as it is common to use the phrase "quantum theory is nonlocal" in
place of "quantum theory does not admit a local ontological model".



where e(M) is M's equivalence class. Finally, a model 
is called transformation noncontextual if the representa- 
tion of every transformation procedure is independent of 
context, that is, if 



$$\Gamma_T(\lambda', \lambda) = \Gamma_{e(T)}(\lambda', \lambda) \tag{7}$$



where e(T) is T's equivalence class. A universally non- 
contextual ontological model is one that is noncontextual 
for all experimental procedures: preparations, transfor- 
mations, and measurements. 
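
The noncontextuality conditions (5) to (7) all have the same form: the representation of a procedure may depend on the procedure only through its equivalence class. A minimal sketch of that check follows, assuming representations are stored as hashable Python values on a two-point ontic space. The function `is_noncontextual`, the class labels and the numbers are hypothetical, and the check compares representations only; it does not verify that the model reproduces the operational statistics.

```python
def is_noncontextual(representation, equivalence_class):
    """Return True if procedures in the same operational equivalence class
    are all assigned the same representation in the model."""
    seen = {}
    for procedure, rep in representation.items():
        cls = equivalence_class[procedure]
        # setdefault stores the first representation seen for this class
        # and returns it on later visits, so any mismatch is detected.
        if seen.setdefault(cls, rep) != rep:
            return False
    return True

# Hypothetical equivalence classes e(M) for four measurement procedures.
equivalence_class = {"M1": "z", "M2": "z", "M3": "n", "M4": "n"}

# Indicator functions stored as tuples, xi[k][l], on a two-point ontic space.
noncontextual_model = {
    "M1": ((1.0, 0.0), (0.0, 1.0)),
    "M2": ((1.0, 0.0), (0.0, 1.0)),
    "M3": ((0.75, 0.25), (0.25, 0.75)),
    "M4": ((0.75, 0.25), (0.25, 0.75)),
}
# Same classes, but M2 is represented differently from M1.
contextual_model = dict(noncontextual_model, M2=((0.9, 0.1), (0.1, 0.9)))

print(is_noncontextual(noncontextual_model, equivalence_class))  # True
print(is_noncontextual(contextual_model, equivalence_class))     # False
```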



III. DEFINITIONS OF NONCONTEXTUALITY 
IN QUANTUM THEORY 

We begin with a quick review of the operational approach to quantum theory,
described, for instance, in Refs. [?].

An equivalence class of preparation procedures is associated with a density
operator $\rho$. This is a positive trace-1 operator over the Hilbert space
$\mathcal{H}$ of the system: $\rho \ge 0$, $\mathrm{Tr}(\rho) = 1$. Rank-1
density operators are simply projectors onto rays of Hilbert space, and are
called pure.

An equivalence class of measurement procedures is associated with a positive
operator valued measure (POVM) $\{E_k\}$. A POVM is an ordered set $\{E_k\}$ of
positive operators that sum to identity, $\sum_k E_k = I$. The kth element,
$E_k$, is associated with the kth outcome. Specifically, given a preparation
associated with a density operator $\rho$, the probability of the outcome k is
simply $\mathrm{Tr}(\rho E_k)$. It is useful to single out the POVMs whose
elements are idempotent, that is, those for which $E_k^2 = E_k$ for all k.
Since idempotent positive operators are projectors, these POVMs are called
projective-valued measures (PVMs). The associated measurements are said to be
sharp. These are the sorts of measurements that are considered in standard
textbook treatments of quantum mechanics. A Hermitian operator defines a PVM
through the projectors in its spectral resolution.

Finally, an equivalence class of transformation procedures is associated with a
completely positive (CP) map $\mathcal{T}$. A CP map $\mathcal{T}$ is a
positive linear map on the space of operators over $\mathcal{H}$ such that
$\mathcal{T} \otimes \mathcal{I}$ is a positive linear map on the space of
operators over $\mathcal{H} \otimes \mathcal{H}'$, where $\mathcal{H}'$ is of
arbitrary dimension, and $\mathcal{I}$ is the identity map on $\mathcal{H}'$.
Unitary maps, familiar from textbook treatments of quantum theory, are
reversible CP maps.

A preparation procedure associated with a non-rank-1 density operator $\rho$
can be implemented in as many ways as there are convex decompositions of
$\rho$. Suppose $\{(p_k, \rho_k)\}$ is a convex decomposition of $\rho$, that
is, $\rho = \sum_k p_k \rho_k$. If one generates a random number according to
the distribution $p_k$, and upon obtaining the number k, one implements the
preparation associated with $\rho_k$, this procedure is a member of the
equivalence class of procedures associated with $\rho$. Another way of
implementing a preparation procedure that is associated with $\rho$ is to
implement a preparation of a purification $|\psi\rangle$ of $\rho$ on
$\mathcal{H} \otimes \mathcal{H}'$ (a purification of $\rho$ is any state
$|\psi\rangle$ such that
$\mathrm{Tr}_{\mathcal{H}'} |\psi\rangle\langle\psi| = \rho$). The equivalence
class therefore also contains members associated with different purifications
of $\rho$.

The assumption of preparation noncontextuality in quantum theory is that the
probability distribution over ontic states that is associated with a
preparation procedure P depends only on the density operator $\rho$ associated
with P,

$$\mu_P(\lambda) = \mu_\rho(\lambda). \tag{8}$$



In particular, the distribution does not depend on the particular convex
decomposition of $\rho$ or on the particular purification of $\rho$ that is
used in the preparation procedure.
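
The multiplicity of preparation contexts is already visible for a qubit. The following is a minimal sketch, assuming pure-Python 2x2 matrices and hypothetical helper names (`projector`, `mix`, `close`): an equal mixture of the two z-basis states and an equal mixture of the two x-basis states are different preparation procedures, yet both are associated with the same density operator, I/2.

```python
import math

def projector(a, b):
    """Return |psi><psi| for the normalized qubit state a|0> + b|1>."""
    norm = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
    a, b = a / norm, b / norm
    return [[a * a.conjugate(), a * b.conjugate()],
            [b * a.conjugate(), b * b.conjugate()]]

def mix(weights, states):
    """Convex combination sum_k p_k rho_k of 2x2 density matrices."""
    return [[sum(p * rho[i][j] for p, rho in zip(weights, states))
             for j in range(2)] for i in range(2)]

def close(rho, sigma, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(rho[i][j] - sigma[i][j]) < tol
               for i in range(2) for j in range(2))

# Two different convex decompositions (two contexts) of the same state.
z_mix = mix([0.5, 0.5], [projector(1, 0), projector(0, 1)])
x_mix = mix([0.5, 0.5], [projector(1, 1), projector(1, -1)])

print(close(z_mix, x_mix))  # True: both equal I/2
```

Preparation noncontextuality, Eq. (8), would require both mixing procedures to be represented by the same distribution over ontic states.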

The multiplicity of contexts for transformation pro- 
cedures parallels the multiplicity of contexts for prepa- 
ration procedures. Transformation procedures that are 
associated with non-unitary CP maps can be obtained 
as a convex sum of unitary maps in many different ways, 
each one of which corresponds to a distinct transforma- 
tion procedure, and can also be obtained by implement- 
ing a unitary map on a larger system that incorporates 
the system of interest [?].

The assumption of transformation noncontextuality in quantum theory is that the
transition matrix that is associated with a given transformation procedure T
depends only on the CP map $\mathcal{T}$ associated with T,

$$\Gamma_T(\lambda', \lambda) = \Gamma_{\mathcal{T}}(\lambda', \lambda). \tag{9}$$



It does not, for instance, depend on the particular convex 
sum of unitaries or the particular unitary on a larger 
system by which the transformation was implemented. 

In the case of quantum measurements, there are also many sorts of contexts. For
instance, every fine-graining of a non-maximally informative measurement (i.e.
a measurement associated with a POVM at least one element of which is not
rank-1) provides a different context. Suppose the POVM $\{F_j\}$ is a
fine-graining of $\{E_k\}$, which is to say that there is a partitioning of the
outcomes j into sets $S_k$ such that $E_k = \sum_{j \in S_k} F_j$. By
implementing a measurement associated with the POVM $\{F_j\}$, then discarding
all information about j except the set $S_k$ to which it belongs, one
implements a measurement in the equivalence class associated with the POVM
$\{E_k\}$.

Despite the fact that the independence of representation on fine-graining is
traditionally the full extent of the assumption of noncontextuality for
measurements (as we will show below), it is not difficult to see that there are
many other sorts of contexts. For instance, there is a context for every convex
decomposition of a non-maximal measurement. A convex decomposition of a POVM
$\{E_k\}$ is defined as a probability distribution $\{p_a\}$ and a set of
POVMs, $\{\mathcal{F}^a\}$ where $\mathcal{F}^a = \{F^a_k\}$, such that
$E_k = \sum_a p_a F^a_k$. By sampling a from the distribution $\{p_a\}$, then
implementing the measurement associated with the POVM $\{F^a_k\}$, and
registering only the outcome of this measurement, one implements a measurement
in the equivalence class associated with the POVM $\{E_k\}$. There is also a
context for every way of obtaining a POVM by coupling to an ancilla and
measuring a PVM on the composite of system+ancilla [?].
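
Both kinds of measurement context can be exhibited for a single qubit POVM. The following is a minimal sketch, assuming pure-Python 2x2 matrices and hypothetical helper names: the same two-outcome POVM is obtained once as an equal convex mixture of the z-basis and x-basis PVMs, and once by coarse-graining a four-outcome POVM with the partition S_0 = {0, 2}, S_1 = {1, 3}.

```python
import math

def projector(a, b):
    """Return |psi><psi| for the normalized qubit state a|0> + b|1>."""
    norm = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
    a, b = a / norm, b / norm
    return [[a * a.conjugate(), a * b.conjugate()],
            [b * a.conjugate(), b * b.conjugate()]]

def scale(c, m):
    return [[c * m[i][j] for j in range(2)] for i in range(2)]

def add(*ms):
    return [[sum(m[i][j] for m in ms) for j in range(2)] for i in range(2)]

def close(m, n, tol=1e-12):
    return all(abs(m[i][j] - n[i][j]) < tol
               for i in range(2) for j in range(2))

z_pvm = [projector(1, 0), projector(0, 1)]
x_pvm = [projector(1, 1), projector(1, -1)]

# Context A: convex decomposition, E_k = sum_a p_a F^a_k with p = (1/2, 1/2).
povm_mixed = [add(scale(0.5, z_pvm[k]), scale(0.5, x_pvm[k]))
              for k in range(2)]

# Context B: coarse-grain a four-outcome POVM {F_j}, E_k = sum_{j in S_k} F_j.
fine = [scale(0.5, z_pvm[0]), scale(0.5, z_pvm[1]),
        scale(0.5, x_pvm[0]), scale(0.5, x_pvm[1])]
partition = [(0, 2), (1, 3)]
povm_coarse = [add(*(fine[j] for j in s)) for s in partition]

identity = [[1, 0], [0, 1]]
print(close(add(*povm_mixed), identity))                            # True
print(all(close(povm_mixed[k], povm_coarse[k]) for k in range(2)))  # True
```

The resulting elements are not projectors, so the measurement is unsharp, which is why such contexts exist even in a 2d Hilbert space.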

The assumption of measurement noncontextuality in quantum theory is that the
set of indicator functions representing a measurement M depends only on the
POVM $\{E_k\}$ associated with M,

$$\xi_{M,k}(\lambda) = \xi_{\{E_k\},k}(\lambda). \tag{10}$$



In addition to admitting new sorts of contexts for mea- 
surements, our generalized notion of measurement con- 
textuality involves a slight revision of what it is that de- 
pends on the measurement context. 

In the past, measurement contextuality has only been considered within the
framework of deterministic hidden variable theories, and the question of
interest has been whether or not the measurement outcome for a given ontic
state of the system depends on the context of the measurement. However, for
objectively indeterministic ontological models, it is clear that the natural
question to ask is whether the probabilities of different outcomes for a given
ontic state of the system depend on the context. This is analogous to Bell's
[?] generalization of the notion of locality from measurement outcomes being
causally independent of parameter settings at space-like separation to the
probabilities of measurement outcomes being causally independent of parameter
settings at space-like separation⁵. This distinction was introduced by Bell in
order to cleanly separate the notion of locality from the notion of
determinism. Similarly, our generalized definition allows one to cleanly
separate the notion of measurement noncontextuality from the notion of
determinism.

Once the question is posed, it is somewhat obvious that if there is to be any
notion of measurement contextuality within objectively indeterministic
ontological models, the appropriate quantities to examine for
context-dependence are the probabilities of different outcomes for a given
ontic state of the system. A less obvious feature of our generalized notion of
measurement contextuality is that the probabilities of outcomes are the
appropriate quantities to examine for context-dependence even in objectively
deterministic ontological models. The key is that the latter sort of model may
still exhibit an epistemic indeterminism, wherein knowledge of the equivalence
class of the measurement together with the ontic state of the system under
investigation does not uniquely fix the outcome. To explain this properly, we
need to consider some of the details of the mathematical representation of
these measurements.

⁵ More specifically, Bell [?] defined a theory to be locally deterministic if
the variables in space-time region I are determined by the variables in a
space-time region that fully closes the backward light-cone of I, and locally
causal if the probability distribution over values for a variable in space-time
region I is determined by a specification of the values of all the variables in
the backward light-cone of I ("determined" in the sense that further
conditioning on variables in the region outside the backward light-cone would
not change the probability distribution).

The distinction between the ontic state of the system determining the outcome
and determining only the probabilities of different outcomes is captured
mathematically within an ontological model by the sorts of indicator functions
one uses to represent measurements. The former case is represented by an
indicator function that is idempotent, that is, one for which
$\chi(\lambda)^2 = \chi(\lambda)$ (we shall denote idempotent indicator
functions by $\chi(\lambda)$ rather than $\xi(\lambda)$). Such functions are
necessarily equal to one in some region of the ontic state space and zero
elsewhere. By virtue of the fact that a set of indicator functions must satisfy
$\sum_k \chi_k(\lambda) = 1$, if all the indicator functions are idempotent,
then they must be nonoverlapping, that is,
$\chi_k(\lambda)\chi_{k'}(\lambda) = 0$ for $k \ne k'$. Thus, for every value
of $\lambda$, only a single indicator function in the set
$\{\chi_k(\lambda)\}$ receives the value 1 while the others receive the value
0. Since the value of the kth indicator function at a given $\lambda$ specifies
the probability of the kth outcome given the ontic state $\lambda$, the outcome
of the measurement is determined for all ontic states if and only if the
measurement is represented by a set of idempotent indicator functions.

We shall call the assumption that a particular mea- 
surement is represented by a set of idempotent indica- 
tor functions the assumption of outcome determinism for 
that measurement. 

Now note that even within an objectively determinis- 
tic ontological model, measurements may fail to exhibit 
outcome determinism: specifying the ontic state of the 
system under investigation together with the equivalence 
class of the measurement procedure may be insufficient 
to uniquely fix the outcome. The outcome might only be 
fixed uniquely by supplementary features of the measure- 
ment procedure (which constitute part of the context of 
the measurement by our definition), such as microscopic 
degrees of freedom of the apparatus. Because the indicator
function for a measurement specifies the dependence
of the outcome on the ontic state of the system under 
investigation, and not the dependence of the outcome on 
the ontic state of any systems that make up the measure- 
ment apparatus or the environment, such a measurement 
must be represented by a non-idempotent set of indicator 
functions. Nonetheless, it may still be the case that for 
each equivalence class of measurements, all the elements 
of the class are represented by the same non-idempotent 
set of indicator functions, and this is all that is required 
for the measurements to be deemed noncontextual by our 
definition. 

As an example, consider a classical system and a classical measurement device
that generates an outcome by rolling one of several differently weighted dice,
with the choice of die being determined by the ontic state of the system. Two
such devices are only found to be operationally equivalent if all of the dice
of one are weighted in the same way as those of the other. Thus, every device
in the equivalence class is represented by the same set of indicator functions,
and consequently one has measurement noncontextuality by our definition. The
underlying ontological model (classical mechanics) is objectively
deterministic, but in order to predict the outcome of a particular measurement,
one must supplement the ontic state of the system by the precise initial
configuration of the dice and their environment, features that form part of the
context of the measurement. Thus, although the outcome of the measurement
clearly depends on the context, we take this to be a failure of outcome
determinism rather than a failure of measurement noncontextuality⁶.
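
The distinction between outcome determinism and valid but non-idempotent indicator functions can be checked directly on a finite ontic space. The following is a minimal sketch, assuming two ontic states and two outcomes; the function names and the dice weights are illustrative assumptions. Each column of `dice_device` holds the face weights of the die selected by one ontic state.

```python
def is_indicator_set(xi, tol=1e-12):
    """xi[k][l] is the probability of outcome k given ontic state l.
    A valid set has values in [0, 1] that sum to 1 over k for every l."""
    n_states = len(xi[0])
    in_range = all(0 <= v <= 1 for row in xi for v in row)
    normalized = all(abs(sum(row[l] for row in xi) - 1) < tol
                     for l in range(n_states))
    return in_range and normalized

def is_outcome_deterministic(xi, tol=1e-12):
    """True if every indicator function is idempotent (values 0 or 1)."""
    return all(abs(v * v - v) < tol for row in xi for v in row)

# Hypothetical dice device: ontic state l selects a weighted two-sided die.
dice_device = [[0.5, 0.1],    # outcome 0
               [0.5, 0.9]]    # outcome 1
# A device whose outcome is fixed by the ontic state alone.
sharp_device = [[1.0, 0.0],
                [0.0, 1.0]]

print(is_indicator_set(dice_device), is_outcome_deterministic(dice_device))
# True False
print(is_indicator_set(sharp_device), is_outcome_deterministic(sharp_device))
# True True
```

Two dice devices with identical weights share the same `xi`, so the representation depends only on the equivalence class even though outcome determinism fails.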

Thus, it is really the notion of outcome determinism, 
rather than the notion of determinism, which we seek to 
cleanly separate from the notion of measurement noncon- 
textuality through our generalized definition. This makes 
our definition of measurement noncontextuality revision- 
ist insofar as the traditional definition implicitly incor- 
porated the assumption of outcome determinism, while 
ours does not. This suggested revision in terminology is 
motivated by the idea that what is crucial to the notion 
of a noncontextual ontological model is that it reproduces 
the equivalence class structure of the operational theory. 

It is worth noting that, given the additional assumption of outcome
determinism, one can recover the traditional definition of measurement
noncontextuality as a special case of our definition. Specifically, if one
considers only sharp measurements and one represents these by sets of
idempotent functions (i.e. one assumes outcome determinism for these
measurements), then the assumption of the independence of the representation of
a measurement on the fine-graining of the PVM with which it is implemented is
just the traditional notion of noncontextuality (described in the
introduction). This can be seen as follows. Specifying whether a Hermitian
operator A is measured together with B or with C is equivalent to specifying a
fine-graining of the PVM $\{P_k\}$ that is defined by the spectral resolution
of A; the simultaneous eigenspaces of A and B define one such fine-graining,
while the simultaneous eigenspaces of A and C define another. Specifying the
eigenvalue assigned to an operator A for every value of $\lambda$ is equivalent
to specifying a set of idempotent indicator functions; the values of $\lambda$
in the support of the function associated with outcome k are simply those that
assign the kth eigenvalue to A. Clearly then, assuming that the value assigned
to A is independent of whether A is measured together with B or C is equivalent
to assuming that the set of idempotent indicator functions associated with the
PVM $\{P_k\}$ is independent of the fine-graining by which it was implemented.



No-go theorems based on the traditional definition of 
noncontextuality apply only in Hilbert spaces of dimen- 
sionality three or greater. Moreover, one cannot extend 
such proofs to 2d Hilbert spaces because there are no 
fine-grainings of non-trivial PVM measurements in a 2d 
Hilbert space, and fine-graining is the only notion of con- 
text that is recognized traditionally. However, by ap- 
pealing to preparations, transformations, and unsharp 
measurements, which admit many contexts even in a 2d 
Hilbert space, proofs of contextuality can be achieved 
here as well. From this perspective, the restriction of 
previous proofs of contextuality to 3d Hilbert spaces was 
an artifact of a limited notion of a context. 

Among the new proofs of contextuality that we shall 
present, the proof of preparation contextuality is the sim- 
plest, and so we begin with this case. 



IV. PROOF OF PREPARATION 
CONTEXTUALITY IN 2D 

There are two features of the representation of prepa- 
ration procedures in an ontological model that are central 
to our proof. The first concerns distinguishability, and 
the second convex combination. 

Feature 1 If two preparation procedures, $P$ and $P'$, are distinguishable with certainty in a single-shot measurement, then their associated probability distributions, $\mu(\lambda)$ and $\mu'(\lambda)$, are nonoverlapping, that is,

$$\mu(\lambda)\,\mu'(\lambda) = 0 \quad \text{for all } \lambda. \tag{11}$$



6 By our definition, any classical theory is necessarily noncontextual for all experimental procedures. This highlights another virtue of our particular definition: contextuality, in all of its manifestations, is found to be a nonclassical phenomenon. For an opposing perspective, see Ref. [17].



This feature can be understood as follows. Suppose 
one wishes to perform a measurement that discriminates, 
with certainty, between two probability distributions. In 
other words, one wishes to perform a measurement that 
allows one to retrodict, with certainty, which distribu- 
tion applied. This is only possible if the distributions to 
be discriminated are nonoverlapping. The reason is that 
if the two distributions overlapped in some region of the 
space of ontic states, then whenever the actual ontic state 
was in that region (and it will sometimes be in that re- 
gion, because the region is assigned nonzero probability 
by both distributions) , no measurement would be able to 
distinguish with certainty whether the system had been 
prepared using one or the other distribution, since the ac- 
tual ontic state is consistent with both. Thus, if, within 
an operational theory, a pair of preparation procedures 
are distinguishable with certainty, then the only way an
ontological model of the theory can account for this fact 
is by associating these procedures with nonoverlapping 
distribution functions. 

The second feature of an ontological model that is crit- 
ical to our proof is the manner in which convex combina- 
tions of preparation procedures are represented. Suppose 
that the preparation procedures $P$ and $P'$ are represented by distributions $\mu(\lambda)$ and $\mu'(\lambda)$. Now suppose that a bit is sampled at random from the distribution $(p, 1-p)$, and the value of the bit is used to determine whether $P$ or $P'$ is implemented, after which the bit is forgotten. This effective procedure, which we call $P''$, must be represented within the ontological model by a distribution $\mu''(\lambda)$ satisfying

$$\mu''(\lambda) = p\,\mu(\lambda) + (1-p)\,\mu'(\lambda). \tag{12}$$



The reason is as follows. The probability that the ontic state of the system is $\lambda$ given procedure $P''$ is simply the sum of the probability that it is $\lambda$ given procedure $P$ and the probability that it is $\lambda$ given procedure $P'$, weighted by the respective probabilities of $P$ and $P'$ given $P''$. Thus, we have:

Feature 2 A convex combination of preparation proce- 
dures is represented within an ontological model by a 
convex sum of the associated probability distributions. 
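
Features 1 and 2 can be illustrated on a toy model. The following Python sketch is an illustration only: it assumes a hypothetical three-element ontic state space (the labels `l1`, `l2`, `l3`) and made-up distributions, none of which appear in the argument above. The helper `mix` forms the convex sum of Eq. (12), and `overlap` tests whether the nonoverlap condition of Eq. (11) fails.

```python
from fractions import Fraction


def mix(p, mu, mu_prime):
    """Feature 2: distribution for 'run P with probability p, otherwise P''."""
    states = set(mu) | set(mu_prime)
    return {s: p * mu.get(s, 0) + (1 - p) * mu_prime.get(s, 0) for s in states}


def overlap(mu, mu_prime):
    """True if some ontic state has nonzero weight under both distributions."""
    states = set(mu) | set(mu_prime)
    return any(mu.get(s, 0) * mu_prime.get(s, 0) != 0 for s in states)


# Hypothetical toy distributions over the ontic states l1, l2, l3 (assumed).
mu = {"l1": Fraction(1, 2), "l2": Fraction(1, 2)}
mu_prime = {"l3": Fraction(1)}

# Mixture that implements P with probability 1/4 and P' with probability 3/4.
mu_mixed = mix(Fraction(1, 4), mu, mu_prime)
```

In this toy model `mu` and `mu_prime` do not overlap, so they could represent perfectly distinguishable preparations, whereas `mu_mixed` overlaps with both of them.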

With these facts in hand, we now proceed with the 
proof. 

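Consider a set of six pure preparations, denoted $P_a$, $P_A$, $P_b$, $P_B$, $P_c$, and $P_C$, corresponding to the normalized Hilbert space vectors

$$\begin{aligned}
\psi_a &= (1, 0), \\
\psi_A &= (0, 1), \\
\psi_b &= (1/2, \sqrt{3}/2), \\
\psi_B &= (\sqrt{3}/2, -1/2), \\
\psi_c &= (1/2, -\sqrt{3}/2), \\
\psi_C &= (\sqrt{3}/2, 1/2),
\end{aligned} \tag{13}$$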



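or, equivalently, the rank 1 density operators

$$\begin{aligned}
\sigma_a &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, &
\sigma_A &= \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \\
\sigma_b &= \begin{pmatrix} \tfrac{1}{4} & \tfrac{\sqrt{3}}{4} \\ \tfrac{\sqrt{3}}{4} & \tfrac{3}{4} \end{pmatrix}, &
\sigma_B &= \begin{pmatrix} \tfrac{3}{4} & -\tfrac{\sqrt{3}}{4} \\ -\tfrac{\sqrt{3}}{4} & \tfrac{1}{4} \end{pmatrix}, \\
\sigma_c &= \begin{pmatrix} \tfrac{1}{4} & -\tfrac{\sqrt{3}}{4} \\ -\tfrac{\sqrt{3}}{4} & \tfrac{3}{4} \end{pmatrix}, &
\sigma_C &= \begin{pmatrix} \tfrac{3}{4} & \tfrac{\sqrt{3}}{4} \\ \tfrac{\sqrt{3}}{4} & \tfrac{1}{4} \end{pmatrix}.
\end{aligned} \tag{14}$$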



One can easily verify the following orthogonality conditions

\begin{align}
\sigma_a \sigma_A &= 0, \tag{15} \\
\sigma_b \sigma_B &= 0, \tag{16} \\
\sigma_c \sigma_C &= 0. \tag{17}
\end{align}



Now consider the preparation procedure wherein one of $P_a$ or $P_A$ is implemented, with the choice being made uniformly at random (for instance, by flipping a fair coin), and with no record being made of the choice. Denote this procedure by $P_{aA}$. Define procedures $P_{bB}$ and $P_{cC}$ similarly. Consider also the preparation procedure wherein one of $P_a$, $P_b$, or $P_c$ is implemented, again, with equal probabilities for each, and without recording the choice. We denote this by $P_{abc}$. The procedure $P_{ABC}$ is defined similarly.

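These procedures are represented in the quantum formalism by the appropriate convex sums of the density operators in Eq. (14). It turns out that all of these convex sums yield the same rank-2 density operator, namely,

$$I/2 = \begin{pmatrix} \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2} \end{pmatrix}, \tag{18}$$

commonly referred to as the 'completely mixed state'. Specifically,

\begin{align}
I/2 &= \tfrac{1}{2}\sigma_a + \tfrac{1}{2}\sigma_A \tag{19} \\
    &= \tfrac{1}{2}\sigma_b + \tfrac{1}{2}\sigma_B \tag{20} \\
    &= \tfrac{1}{2}\sigma_c + \tfrac{1}{2}\sigma_C \tag{21} \\
    &= \tfrac{1}{3}\sigma_a + \tfrac{1}{3}\sigma_b + \tfrac{1}{3}\sigma_c \tag{22} \\
    &= \tfrac{1}{3}\sigma_A + \tfrac{1}{3}\sigma_B + \tfrac{1}{3}\sigma_C. \tag{23}
\end{align}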
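
The orthogonality relations and the five decompositions of the completely mixed state are easy to confirm numerically. The sketch below is a sanity check rather than part of the proof: it builds the six rank-1 projectors from the real vectors of Eq. (13) using plain Python lists, and all helper names (`projector`, `mixture`, `close`) are our own choices for illustration.

```python
import math

S3 = math.sqrt(3)

# Real unit vectors of Eq. (13), keyed by their labels.
PSI = {
    "a": (1.0, 0.0), "A": (0.0, 1.0),
    "b": (0.5, S3 / 2), "B": (S3 / 2, -0.5),
    "c": (0.5, -S3 / 2), "C": (S3 / 2, 0.5),
}


def projector(v):
    """Rank-1 density operator |v><v| for a real 2-vector v."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]


def matmul(m, n):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(m, n, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices up to rounding error."""
    return all(abs(m[i][j] - n[i][j]) < tol for i in range(2) for j in range(2))


SIGMA = {label: projector(v) for label, v in PSI.items()}


def mixture(labels):
    """Equal-weight convex sum of the density operators with these labels."""
    w = 1.0 / len(labels)
    return [[sum(w * SIGMA[l][i][j] for l in labels) for j in range(2)]
            for i in range(2)]


HALF_IDENTITY = [[0.5, 0.0], [0.0, 0.5]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]

# The five decompositions corresponding to Eqs. (19)-(23).
all_mixed = all(close(mixture(d), HALF_IDENTITY)
                for d in ["aA", "bB", "cC", "abc", "ABC"])
# The three orthogonality conditions of Eqs. (15)-(17).
all_orthogonal = all(close(matmul(SIGMA[x], SIGMA[x.upper()]), ZERO)
                     for x in "abc")
```

Both `all_mixed` and `all_orthogonal` should evaluate to `True`; floating-point arithmetic is used, so the comparison is made up to a small tolerance.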



In Fig. 1 we present the Bloch ball representation of the seven density operators defined above. This provides a graphical synopsis of the relevant orthogonality relations and convex structure.

FIG. 1: The Bloch ball representation of the six pure states and the five convex decompositions of the completely mixed state used in the proof of preparation contextuality. Each convex decomposition is represented by a convex polytope whose vertices represent the elements of the decomposition [18]. The 2-element decompositions in our example are represented by line segments, and the 3-element decompositions by equilateral triangles.

Within an ontological model of operational quantum theory, each preparation procedure $P_x$ is associated with a probability distribution $\mu_x(\lambda)$. Now note that if two density operators, $\sigma$ and $\sigma'$, are orthogonal in the vector space of operators, that is, $\sigma\sigma' = 0$, then the associated preparation procedures can be distinguished with certainty in a single-shot measurement (for instance, for preparations associated with orthogonal Hilbert space vectors, one simply implements the measurement associated with an orthogonal basis that includes these vectors). By feature 1 of ontological models (described above), distinguishable procedures are represented by nonoverlapping distributions. Thus, from Eqs. (15)-(17) we can infer that

\begin{align}
\mu_a(\lambda)\,\mu_A(\lambda) &= 0, \tag{24} \\
\mu_b(\lambda)\,\mu_B(\lambda) &= 0, \tag{25} \\
\mu_c(\lambda)\,\mu_C(\lambda) &= 0. \tag{26}
\end{align}



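Furthermore, in any ontological model a convex combination of preparation procedures is represented by a convex sum of the associated probability distributions (feature 2 above). Thus, if the procedures $P_{aA}, P_{bB}, \ldots, P_{ABC}$ are represented by distributions $\mu_{aA}(\lambda), \mu_{bB}(\lambda), \ldots, \mu_{ABC}(\lambda)$, the manner in which these procedures are obtained by convex combination of $P_a, P_A, \ldots, P_C$ implies that

\begin{align}
\mu_{aA}(\lambda) &= \tfrac{1}{2}\mu_a(\lambda) + \tfrac{1}{2}\mu_A(\lambda), \tag{27} \\
\mu_{bB}(\lambda) &= \tfrac{1}{2}\mu_b(\lambda) + \tfrac{1}{2}\mu_B(\lambda), \tag{28} \\
\mu_{cC}(\lambda) &= \tfrac{1}{2}\mu_c(\lambda) + \tfrac{1}{2}\mu_C(\lambda), \tag{29} \\
\mu_{abc}(\lambda) &= \tfrac{1}{3}\mu_a(\lambda) + \tfrac{1}{3}\mu_b(\lambda) + \tfrac{1}{3}\mu_c(\lambda), \tag{30} \\
\mu_{ABC}(\lambda) &= \tfrac{1}{3}\mu_A(\lambda) + \tfrac{1}{3}\mu_B(\lambda) + \tfrac{1}{3}\mu_C(\lambda). \tag{31}
\end{align}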



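The assumption of a preparation noncontextual ontological model is that the distribution associated with a preparation procedure depends only on the operational equivalence class of that procedure, and thus only on the density operator associated with that procedure. Since the procedures $P_{aA}, P_{bB}, \ldots, P_{ABC}$ are all represented by $I/2$, they must all be represented by the same distribution in a preparation noncontextual ontological model. Thus, we require $\mu_{aA} = \mu_{bB} = \cdots = \mu_{ABC}$. Denoting this distribution by $\nu(\lambda)$, we have simply

\begin{align}
\nu(\lambda) &= \tfrac{1}{2}\mu_a(\lambda) + \tfrac{1}{2}\mu_A(\lambda) \tag{32} \\
             &= \tfrac{1}{2}\mu_b(\lambda) + \tfrac{1}{2}\mu_B(\lambda) \tag{33} \\
             &= \tfrac{1}{2}\mu_c(\lambda) + \tfrac{1}{2}\mu_C(\lambda) \tag{34} \\
             &= \tfrac{1}{3}\mu_a(\lambda) + \tfrac{1}{3}\mu_b(\lambda) + \tfrac{1}{3}\mu_c(\lambda) \tag{35} \\
             &= \tfrac{1}{3}\mu_A(\lambda) + \tfrac{1}{3}\mu_B(\lambda) + \tfrac{1}{3}\mu_C(\lambda). \tag{36}
\end{align}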



We now show that there is no set of distributions satisfying Eqs. (24)-(26) and Eqs. (32)-(36). Consider the values of the various probability densities at a fixed value of $\lambda$. We denote these simply as $\mu_a, \mu_A, \ldots, \mu_C$. We show that the only solution to all the constraints, for a fixed $\lambda$, is $\mu_a, \mu_A, \ldots, \mu_C = 0$, which we call the all-zero solution.

To satisfy Eqs. (24)-(26), one of the pair $\mu_a$ and $\mu_A$ must be zero, as must be one of the pair $\mu_b$ and $\mu_B$ and one of the pair $\mu_c$ and $\mu_C$. In all, there are eight possible assignments of zeroes that satisfy Eqs. (24)-(26). We consider each of these in turn.

If we have $\mu_a, \mu_b, \mu_c = 0$ then by Eq. (35) we have $\nu = 0$, and by Eqs. (32)-(34), we conclude that $\mu_A, \mu_B, \mu_C = 0$, so that we have the all-zero solution. If, instead, we have $\mu_a, \mu_b, \mu_C = 0$ then by combining Eq. (34) and Eq. (35) we find $\tfrac{1}{2}\mu_c = \tfrac{1}{3}\mu_c$, for which the only solution is $\mu_c = 0$. But this gets us back to the first case, and the all-zero solution. Every other case yields the all-zero solution by virtue of the symmetry of the problem under rotations by multiples of 60 degrees in the Bloch sphere representation.

The above argument did not depend on $\lambda$, and thus for all $\lambda$ the only solution is the all-zero solution. Consequently, the only set of distributions that satisfy Eqs. (24)-(26) and Eqs. (32)-(36) is the set of uniformly zero distributions, $\mu_a(\lambda), \mu_A(\lambda), \ldots, \mu_C(\lambda) = 0$. But such distributions are not probability distributions since they are not normalized to one. This concludes the proof.
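
The case analysis can also be checked mechanically, without appealing to the symmetry argument. The sketch below is one possible way of organizing that check and is not the derivation used in the text. It relies on the following observation: once one member of each pair is set to zero, Eqs. (32)-(34) force the surviving member of each pair to equal $2\nu$, so that Eqs. (35) and (36) each reduce to a condition of the form $c\,\nu = 0$. The function names are our own.

```python
from fractions import Fraction
from itertools import product


def nu_constraints(lower_survives):
    """Coefficients c such that Eqs. (35) and (36) reduce to c * nu = 0.

    lower_survives[i] is True when the lower-case member of pair i (a, b, c)
    may be nonzero and its upper-case partner is set to zero; False means
    the reverse. By Eqs. (32)-(34) each surviving member equals 2 * nu.
    """
    k = sum(lower_survives)                    # surviving lower-case terms
    coeff_35 = Fraction(2 * k, 3) - 1          # (1/3) * k * (2 nu) - nu
    coeff_36 = Fraction(2 * (3 - k), 3) - 1    # (1/3) * (3 - k) * (2 nu) - nu
    return coeff_35, coeff_36


def only_all_zero_solution():
    """True if each of the eight zero patterns forces nu = 0."""
    for case in product([True, False], repeat=3):
        coeff_35, coeff_36 = nu_constraints(case)
        if coeff_35 == 0 and coeff_36 == 0:
            return False                       # a nonzero nu would be allowed
    return True


result = only_all_zero_solution()
```

Since $2k/3 = 1$ has no integer solution for $k$, the coefficient is never zero, so $\nu = 0$ in every case and every surviving density, being equal to $2\nu$, vanishes as well.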

I am grateful to Terry Rudolph for having improved 
upon my original proof of preparation contextuality by 
proposing the highly symmetric example presented here. 



V. PROOFS OF CONTEXTUALITY FOR 
UNSHARP MEASUREMENTS IN 2D 

Proofs of measurement contextuality have usually arisen only in the context of sharp measurements, that is, those associated with PVMs⁷, and outcome determinism has been assumed for such measurements. We shall make the same assumption here for sharp measurements, but we shall be considering unsharp measurements as well, that is, those associated with POVMs, and for these, outcome determinism will not be assumed. It is important to note that the "proofs of contextuality" presented in the next two subsections are contingent on the assumption of outcome determinism for sharp measurements. The status of this assumption will be revisited in section VIII, where we will clarify what, precisely, has been proven.



A. A proof based on a finite set of measurements 

Consider three binary-outcome measurements, $M_a$, $M_b$, and $M_c$, associated respectively with PVMs $\{P_a, P_A\}$, $\{P_b, P_B\}$ and $\{P_c, P_C\}$, where $P_a$ projects onto the ray spanned by $\psi_a$, $P_A$ projects onto the ray spanned by $\psi_A$, and so forth, with the vectors $\psi_x$ being those that are defined in Eq. (13)⁸. By the definition of a PVM, we have

\begin{align}
P_a + P_A &= I, \tag{37} \\
P_b + P_B &= I, \tag{38} \\
P_c + P_C &= I, \tag{39}
\end{align}

and

\begin{align}
P_a P_A &= 0, \tag{40} \\
P_b P_B &= 0, \tag{41} \\
P_c P_C &= 0. \tag{42}
\end{align}

7 Exceptions are Refs. [4], [17], [19].


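Given the assumption of outcome determinism for sharp measurements, the representations of $M_a$, $M_b$, and $M_c$ in an ontological model are the sets of idempotent indicator functions $\{\chi_a(\lambda), \chi_A(\lambda)\}$, $\{\chi_b(\lambda), \chi_B(\lambda)\}$, and $\{\chi_c(\lambda), \chi_C(\lambda)\}$ respectively. By definition, these must satisfy

\begin{align}
\chi_a(\lambda) + \chi_A(\lambda) &= 1, \tag{43} \\
\chi_b(\lambda) + \chi_B(\lambda) &= 1, \tag{44} \\
\chi_c(\lambda) + \chi_C(\lambda) &= 1, \tag{45}
\end{align}

and

\begin{align}
\chi_a(\lambda)\,\chi_A(\lambda) &= 0, \tag{46} \\
\chi_b(\lambda)\,\chi_B(\lambda) &= 0, \tag{47} \\
\chi_c(\lambda)\,\chi_C(\lambda) &= 0. \tag{48}
\end{align}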


Now consider choosing one of $M_a$, $M_b$, and $M_c$ at random, with probability 1/3 for each, implementing the chosen measurement, and only registering whether the first (small letter) or the second (capital letter) outcome occurred. Call the effective measurement procedure that results $M$. It is associated with the POVM

$$\left\{ \tfrac{1}{3}P_a + \tfrac{1}{3}P_b + \tfrac{1}{3}P_c,\; \tfrac{1}{3}P_A + \tfrac{1}{3}P_B + \tfrac{1}{3}P_C \right\}. \tag{49}$$



In an ontological model, a convex combination of measurement procedures is represented by an element-wise convex sum of the associated sets of indicator functions (for the same reason that an ontological model has feature 2 of section IV). Thus, $M$ is represented by the set of indicator functions

$$\left\{ \tfrac{1}{3}\chi_a(\lambda) + \tfrac{1}{3}\chi_b(\lambda) + \tfrac{1}{3}\chi_c(\lambda),\; \tfrac{1}{3}\chi_A(\lambda) + \tfrac{1}{3}\chi_B(\lambda) + \tfrac{1}{3}\chi_C(\lambda) \right\}. \tag{50}$$

Note that the POVM (49) is equal to⁹

$$\{ I/2,\; I/2 \}. \tag{51}$$



But it is clear from this way of writing the POVM that the measurement has a random outcome regardless of the preparation procedure, since $\mathrm{Tr}(\rho\, I/2) = 1/2$ regardless of $\rho$. It then follows that the equivalence class of measurement procedures that contains $M$ also contains the "measurement" procedure $\tilde{M}$ that completely ignores the system and just flips a fair coin to determine the outcome. Now consider how the measurement $\tilde{M}$ is represented in the ontological model. Because the outcome doesn't depend on the system at all, it follows that regardless of the value of $\lambda$, there is a probability of 1/2 for each outcome, so it is represented by the set of indicator functions

$$\left\{ \tfrac{1}{2},\; \tfrac{1}{2} \right\}, \tag{52}$$

where each element should be thought of as a uniform function over $\lambda$ of height 1/2¹⁰.

By the assumption of measurement noncontextuality, the measurement $M$ must be represented by the same set of indicator functions as the measurement $\tilde{M}$. It follows that the set of functions (50) must be equal to the set of functions (52). However, this constraint is inconsistent with the constraints (43)-(48). To satisfy Eqs. (43)-(48) it is necessary that for every value of $\lambda$, one of $\chi_a(\lambda)$ and $\chi_A(\lambda)$ must be equal to 0 and the other equal to 1. The same is true of $\chi_b(\lambda)$ and $\chi_B(\lambda)$ and of $\chi_c(\lambda)$ and $\chi_C(\lambda)$. The eight possible assignments of values to these six quantities leave the set of functions (50) with the values $\{0, 1\}$, $\{1, 0\}$, $\{\tfrac{1}{3}, \tfrac{2}{3}\}$ or $\{\tfrac{2}{3}, \tfrac{1}{3}\}$ but never $\{\tfrac{1}{2}, \tfrac{1}{2}\}$. This concludes the proof.
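
The final counting step can be reproduced by brute force. The following sketch is an illustrative check only: it enumerates the eight deterministic assignments permitted by Eqs. (43)-(48) and collects the pairs of values that the two indicator functions of $M$ can take at a single ontic state. The names `responses_of_M`, `reachable` and `coin_flip` are our own.

```python
from fractions import Fraction
from itertools import product

THIRD = Fraction(1, 3)


def responses_of_M():
    """Values of the two indicator functions of M at one ontic state.

    Each chi_x is 0 or 1, and chi_X = 1 - chi_x for the matching capital
    letter, as required by outcome determinism for the three PVMs.
    """
    results = set()
    for chi_a, chi_b, chi_c in product([0, 1], repeat=3):
        first = THIRD * (chi_a + chi_b + chi_c)
        second = THIRD * ((1 - chi_a) + (1 - chi_b) + (1 - chi_c))
        results.add((first, second))
    return results


reachable = responses_of_M()
coin_flip = (Fraction(1, 2), Fraction(1, 2))
```

The set `reachable` contains exactly the four pairs listed in the text, and `coin_flip` is not among them, which is the contradiction used in the proof.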



B. A proof based on the 2D version of Gleason's theorem



The impossibility of noncontextuality for unsharp measurements and outcome determinism for sharp measurements can also be established in a 2d Hilbert space by making appeal to a recent Gleason-like derivation of the quantum probability rule by Busch [5] and by Caves et al. [6]. This "generalized Gleason's theorem" starts from the assumption that there exists a probability measure that





assigns a unique probability $w(E)$ to every positive operator $E$ such that $w(I) = 1$, and whenever a set of positive operators forms a resolution of identity, $\sum_k E_k = I$, the associated probabilities sum to 1, $\sum_k w(E_k) = 1$. From these assumptions, it is proven that the measure must satisfy $w(E) = \mathrm{Tr}(\rho E)$ for some density operator $\rho$.

8 Note that $P_a = \sigma_a$, $P_A = \sigma_A$, etcetera, where $\sigma_x$ is defined in Eq. (14). This follows from the fact that the rank-1 density operator associated with a vector is simply the projector onto the ray spanned by that vector. It follows that Eqs. (37)-(39) and Eqs. (40)-(42) are equivalent to Eqs. (19)-(21) and Eqs. (15)-(17) respectively. We use a distinct notation for the same mathematical operators to remind the reader of the fact that in this section they represent measurement outcomes rather than preparation procedures.

9 This fact is also captured by Eqs. (22) and (23).

10 This fact can also be established by noting that the equivalence class includes the measurement $M'$, obtained from $M$ by permuting the two outcomes (because such a permutation does not change the statistics of outcomes). The ontological representations of $M$ and $M'$ are $\{\xi_1(\lambda), \xi_2(\lambda)\}$ and $\{\xi_2(\lambda), \xi_1(\lambda)\}$. Now, the assumption of measurement noncontextuality implies that since $M$ and $M'$ are in the same equivalence class, they must be represented by the same set of indicator functions. Thus we require that $\xi_1(\lambda) = \xi_2(\lambda)$. But since $\xi_1(\lambda) + \xi_2(\lambda) = 1$, it follows that $\xi_1(\lambda) = \xi_2(\lambda) = 1/2$ for all $\lambda$.

Recall that the values of a set of indicator functions $\{\xi_k(\lambda)\}$ at a particular value of $\lambda$ form a probability distribution over $k$. In a measurement noncontextual theory, every positive operator $E$ is represented by a unique indicator function $\xi_E(\lambda)$, with the identity operator being represented by the unit function. Moreover, whenever a set of positive operators forms a resolution of identity, $\sum_k E_k = I$, the associated indicator functions sum to the unit function, $\sum_k \xi_{E_k}(\lambda) = 1$. Thus, the set of all indicator functions for a given value of $\lambda$ in an ontological model satisfy the assumptions of the set of probability measures in the generalized Gleason's theorem. It follows therefore that for every value of $\lambda$ in the ontological model, there is a density operator $\rho_\lambda$ such that $\xi_E(\lambda) = \mathrm{Tr}(\rho_\lambda E)$.

If in addition to measurement noncontextuality, one assumes outcome determinism for sharp measurements, then every projector is represented by a unique idempotent indicator function $\chi_P(\lambda)$, and by the generalized Gleason's theorem,

$$\chi_P(\lambda) = \mathrm{Tr}(\rho_\lambda P). \tag{53}$$

Suppose that $P = |\psi\rangle\langle\psi|$, and consider a $\lambda$ such that $\chi_P(\lambda) = 1$. In this case, Eq. (53) implies that $\rho_\lambda = |\psi\rangle\langle\psi|$. But then for some other projector $P' = |\psi'\rangle\langle\psi'|$, where $0 < |\langle\psi|\psi'\rangle| < 1$, we have for this value of $\lambda$ that $\chi_{P'}(\lambda) = \mathrm{Tr}(\rho_\lambda P') = |\langle\psi|\psi'\rangle|^2$ and consequently $0 < \chi_{P'}(\lambda) < 1$, which implies that $\chi_{P'}(\lambda)$ is not idempotent. Thus, the assumption of noncontextuality for unsharp measurements and outcome determinism for sharp measurements yields a contradiction in a 2d Hilbert space.
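
A concrete instance makes the last step explicit. The sketch below is an illustration under our own choice of vectors: it takes $\psi = \psi_a$ and $\psi' = \psi_b$ from Eq. (13), which are real, so that the inner product reduces to a dot product. The helper names are hypothetical.

```python
import math


def overlap_probability(psi, psi_prime):
    """|<psi|psi'>|^2 for real 2-vectors.

    This equals Tr(rho_lambda P') when rho_lambda = |psi><psi| and
    P' = |psi'><psi'|.
    """
    inner = sum(x * y for x, y in zip(psi, psi_prime))
    return inner * inner


def is_idempotent(value, tol=1e-12):
    """A number is idempotent when value**2 == value, i.e. it is 0 or 1."""
    return abs(value * value - value) < tol


psi_a = (1.0, 0.0)
psi_b = (0.5, math.sqrt(3) / 2)

# Value forced on chi_{P'} at an ontic state where chi_P = 1.
chi_P_prime = overlap_probability(psi_a, psi_b)
```

Here `chi_P_prime` comes out as 1/4, which lies strictly between 0 and 1 and therefore cannot be the value of an idempotent indicator function.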

This no-go theorem is related to the no-go theorem of the previous section in the same way that the no-go theorem [1] that is obtained from the standard Gleason's theorem is related to the original Kochen-Specker theorem [2]. The former derive a contradiction using the full set of measurements, while the latter only make use of a finite set.



VI. PROOF OF TRANSFORMATION 
CONTEXTUALITY IN 2D 

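Consider a set of six transformation procedures, denoted $T_0, T_{\pi/3}, T_{2\pi/3}, T_{\pi}, T_{4\pi/3}, T_{5\pi/3}$, where the procedure $T_\theta$ corresponds to the CP map

$$\mathcal{T}_\theta(\rho) = U_{y,\theta}\,\rho\,U_{y,\theta}^{\dagger}, \tag{54}$$

and where

$$U_{y,\theta} = \begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix} \tag{55}$$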

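is the unitary operator describing a rotation by $\theta$ about the $y$ axis in the Bloch sphere. Consider also the CP map $\mathcal{T}$ that takes all points in the Bloch sphere and projects them onto the $y$ axis. There are many ways of implementing $\mathcal{T}$ as a convex sum of transformations, specifically,

\begin{align}
\mathcal{T} &= \tfrac{1}{2}\mathcal{T}_{0} + \tfrac{1}{2}\mathcal{T}_{\pi} \tag{56} \\
            &= \tfrac{1}{2}\mathcal{T}_{\pi/3} + \tfrac{1}{2}\mathcal{T}_{4\pi/3} \tag{57} \\
            &= \tfrac{1}{2}\mathcal{T}_{2\pi/3} + \tfrac{1}{2}\mathcal{T}_{5\pi/3} \tag{58} \\
            &= \tfrac{1}{3}\mathcal{T}_{0} + \tfrac{1}{3}\mathcal{T}_{2\pi/3} + \tfrac{1}{3}\mathcal{T}_{4\pi/3} \tag{59} \\
            &= \tfrac{1}{3}\mathcal{T}_{\pi/3} + \tfrac{1}{3}\mathcal{T}_{\pi} + \tfrac{1}{3}\mathcal{T}_{5\pi/3}. \tag{60}
\end{align}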



These identities can be explained as follows. The map $\mathcal{T}$ can be achieved by performing with probability 1/2 a rotation in the Bloch sphere about the $y$ axis by $\theta$ and with probability 1/2 a rotation by $\theta + \pi$. Taking $\theta = 0$, $\pi/3$ and $2\pi/3$ yields Eqs. (56)-(58). The map $\mathcal{T}$ can also be achieved by performing a rotation about $y$ by $\theta$, by $\theta + 2\pi/3$ or by $\theta + 4\pi/3$ with equal probabilities. Taking $\theta = 0$ and $\pi/3$ yields Eqs. (59) and (60). A rigorous proof of these statements is provided in the appendix.
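
These identities can also be checked numerically in the Bloch-vector picture. The sketch below is a sanity check and not the appendix proof: it assumes the standard correspondence in which the unitary $U_{y,\theta}$ acts on a Bloch vector $(x, y, z)$ as an ordinary rotation by $\theta$ about the $y$ axis, and it tests each mixture on the three coordinate basis vectors, which suffices because every map involved acts linearly on Bloch vectors. The function names are our own.

```python
import math


def rotate_about_y(theta, r):
    """Rotate the Bloch vector r = (x, y, z) by theta about the y axis."""
    x, y, z = r
    return (x * math.cos(theta) + z * math.sin(theta),
            y,
            -x * math.sin(theta) + z * math.cos(theta))


def average_map(angles, r):
    """Equal-weight mixture of rotations by the given angles, applied to r."""
    images = [rotate_about_y(t, r) for t in angles]
    return tuple(sum(v[i] for v in images) / len(images) for i in range(3))


def projects_onto_y(angles, tol=1e-12):
    """True if the mixture sends (x, y, z) to (0, y, 0) on all basis vectors."""
    for r in [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]:
        x, y, z = average_map(angles, r)
        if abs(x) > tol or abs(y - r[1]) > tol or abs(z) > tol:
            return False
    return True


PI = math.pi
# Angle sets appearing in Eqs. (56)-(60), in that order.
MIXTURES = [
    [0, PI],
    [PI / 3, 4 * PI / 3],
    [2 * PI / 3, 5 * PI / 3],
    [0, 2 * PI / 3, 4 * PI / 3],
    [PI / 3, PI, 5 * PI / 3],
]

all_agree = all(projects_onto_y(angles) for angles in MIXTURES)
```

The direction of rotation (the sign of $\theta$) does not affect the result, since each mixture is symmetric under $\theta \to -\theta$ up to a relabeling of its terms. By contrast, a mixture such as rotations by 0 and $\pi/3$ does not project onto the $y$ axis.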

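By the assumption of transformation noncontextuality each of the seven CP maps we have considered is associated with a unique transition matrix on the space of ontic states. Suppose that we denote the transition matrix associated with $\mathcal{T}$ by $\Gamma$, and the transition matrix associated with $\mathcal{T}_\theta$ by $\Gamma_\theta$. Because a convex sum of transformation procedures is represented in an ontological model by a convex sum of the associated transition matrices, Eqs. (56)-(60) imply

\begin{align}
\Gamma &= \tfrac{1}{2}\Gamma_{0} + \tfrac{1}{2}\Gamma_{\pi} \tag{61} \\
       &= \tfrac{1}{2}\Gamma_{\pi/3} + \tfrac{1}{2}\Gamma_{4\pi/3} \tag{62} \\
       &= \tfrac{1}{2}\Gamma_{2\pi/3} + \tfrac{1}{2}\Gamma_{5\pi/3} \tag{63} \\
       &= \tfrac{1}{3}\Gamma_{0} + \tfrac{1}{3}\Gamma_{2\pi/3} + \tfrac{1}{3}\Gamma_{4\pi/3} \tag{64} \\
       &= \tfrac{1}{3}\Gamma_{\pi/3} + \tfrac{1}{3}\Gamma_{\pi} + \tfrac{1}{3}\Gamma_{5\pi/3}. \tag{65}
\end{align}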



Note that $\mathcal{T}_\theta$ and $\mathcal{T}_{\theta+\pi}$ take any rank-1 density operator lying in the $z$-$x$ plane of the Bloch sphere to a pair of orthogonal density operators. Since these are distinguishable with certainty, it follows from feature 1 of ontological models (see section IV) that the transition matrices $\Gamma_\theta$ and $\Gamma_{\theta+\pi}$ must take any distribution $\mu_x(\lambda)$ to disjoint distributions, that is,

$$\left[ \int d\lambda'\, \Gamma_\theta(\lambda, \lambda')\, \mu_x(\lambda') \right] \left[ \int d\lambda'\, \Gamma_{\theta+\pi}(\lambda, \lambda')\, \mu_x(\lambda') \right] = 0. \tag{66}$$
Now consider how our seven transition matrices affect the distribution $\mu_a(\lambda)$ associated with the density operator $\sigma_a$, defined in Eq. (14) (recall that $\sigma_a$ is represented on the Bloch sphere by the vector pointing along the $z$ axis). We obtain seven distinct distributions, which we denote $\mu_\theta(\lambda) = \int d\lambda'\, \Gamma_\theta(\lambda, \lambda')\, \mu_a(\lambda')$, for $\theta = 0, \pi/3, 2\pi/3, \pi, 4\pi/3, 5\pi/3$, and $\mu(\lambda) = \int d\lambda'\, \Gamma(\lambda, \lambda')\, \mu_a(\lambda')$. By virtue of Eqs. (61)-(65) and Eq. (66), these seven distributions satisfy Eqs. (32)-(36) and Eqs. (24)-(26), where $a, A, b, B, c, C$ are associated with $\theta = 0, \pi, 2\pi/3, 5\pi/3, 4\pi/3, \pi/3$ respectively. But Eqs. (32)-(36) and Eqs. (24)-(26) cannot be satisfied simultaneously, so we have arrived at a contradiction¹¹.

In a draft of this article, the question of the existence of 
a no-go theorem for transformation noncontextuality was 
left as an open problem. The question was resolved by 
Terry Rudolph who provided the example given above. I 
am grateful for his permission to present the result here. 



VII. IS THE ASSUMPTION OF 
NONCONTEXTUALITY NATURAL? 

An important question is whether the assumption of 
noncontextuality for preparations, transformations, and 
unsharp measurements is as well motivated as this same 
assumption for sharp measurements, to which the notion 
is usually restricted. To answer this, one must consider 
the motivation for the latter, which seems to be one of 
ontological economy: be wary of introducing differences 
in the ontological explanations of empirical phenomena 
where there are no differences in the phenomena them- 
selves. Einstein's equivalence principle is an example of a 
fruitful application of this principle. If this is indeed the 
motivation, then it clearly also applies to our generalized 
notions of noncontextuality. Specifically, if one believes 
that equivalent statistics suggest equivalent ontological 
representations for sharp measurements, why should one 
not believe this for preparations, transformations, and 
unsharp measurements as well? Thus, barring an alter- 
native motivation for the traditional notion of noncontex- 
tuality, it seems that an ontological model that respects 
the statistical equivalence class structure of preparations, 
transformations, and unsharp measurements is as well (or 
badly) motivated as an ontological model that respects 
the statistical equivalence class structure of sharp mea- 
surements. 

This of course leaves open the question of whether any 
assumption of noncontextuality is natural. The answer 
seems to depend on one's interpretational bent. John Bell, for instance, thought that contextuality was not at all surprising¹², whereas David Mermin has characterized it as a mystery in need of explanation¹³.

In order to defend the view that measurement contex- 
tuality is indeed mysterious within the framework of an 
ontological model, we show that the reasons for thinking 
so are very similar to the reasons for thinking that nonlocality is mysterious. Disregarding classical prejudice, nonlocality is not an unreasonable assumption. However, if the universe is fundamentally nonseparable or is
such that causal influences can propagate faster than the 
speed of light, then why should it also be the case that 
one cannot use these effects to achieve super-luminal sig- 
nalling? Given the presence of nonlocality at the on- 
tological level, it seems almost conspiratorial that one 
cannot make use of this nonlocality for signalling. Simi- 
larly, it is certainly not unreasonable for the statistics of 
experimental outcomes for a given ontic state to depend 
on details of the experimental procedure. But assum- 
ing this to be the case, it is very surprising that when 
one considers any valid probability distribution over the 
ontic states (that is, any distribution that characterizes 
what someone who knows only the preparation procedure 
knows about the ontic state), the weighted average over 
the statistics of outcomes does not depend on the details 
of the experimental procedure. Again, this seems almost 
conspiratorial. This analogy suggests that removing the 
appearance of conspiracy from contextuality may well be 
on a par with reconciling Bell's theorem and relativity as 
a guide for progress in the search for a wholly satisfactory 
realist interpretation of quantum theory. 

It is likely that the notion of preparation noncontex- 
tuality will also seem natural to some and unnatural to 
others. To shed some light on the diversity of reactions, it 
is useful to distinguish two different types of ontological 
model of quantum theory. Specifically, we distinguish 
what we call the epistemic view and the ontic view of
quantum states [25].

The epistemic view of quantum states asserts that a 
density operator represents nothing more than an agent's 
knowledge about the ontic state of the system. Specifi- 
cally, it represents the knowledge of someone who knows 
only the preparation procedure. In this view, the on- 
tic state of a system does not fix the density operator 
that is used to describe it. Distinct non-orthogonal den- 
sity operators (including the pure cases) are represented 
by overlapping probability distributions within this view 
and are thus consistent with a single ontic state. By contrast, the ontic view of quantum states asserts that the density operator itself represents an attribute of the system, and consequently that two distinct density operators represent mutually exclusive physical states of affairs and are therefore represented in the ontological model by nonoverlapping (i.e. disjoint) probability distributions.

11 It should be noted that the above argument is equivalent to a proof of preparation contextuality in four dimensions if one makes use of the Jamiolkowski isomorphism between density operators in a 4d space and CP maps in a 2d space [20].

12 Bell states: "The result of an observation may reasonably depend not only on the state of the system (including hidden variables) but also on the complete disposition of the apparatus."

13 Mermin states: "if one is attempting a hidden variable model at all, it seems not unreasonable to expect the model to provide the obvious explanation for this striking insensitivity of the distribution to changes in the experimental arrangement - namely, that the hidden variables are noncontextual" [21].

To be precise, for a set $S$ of density operators (assumed to contain some nonorthogonal elements), an ontological model adopts an ontic view of $S$ if all distinct elements of $S$ are represented by disjoint distributions, that is,

$$\rho \neq \rho' \ \text{implies} \ \mu_\rho(\lambda)\,\mu_{\rho'}(\lambda) = 0 \quad \text{for all } \rho, \rho' \in S, \tag{67}$$

whereas an ontological model adopts an epistemic view of $S$ if only orthogonal elements of $S$ are represented by disjoint distributions,

$$\mu_\rho(\lambda)\,\mu_{\rho'}(\lambda) = 0 \ \text{only if} \ \rho\rho' = 0, \quad \text{for all } \rho, \rho' \in S. \tag{68}$$

In other words, in an epistemic view of $S$, being orthogonal is a necessary condition for a pair of quantum states to be represented by disjoint distributions (the argument presented at the beginning of section IV shows that orthogonality is a sufficient condition for disjointness, regardless of whether one adopts an ontic or an epistemic view).

We now show that an ontic view of the set of pure 
quantum states rules out the possibility of preparation 
noncontextuality trivially. Our purpose here is to show 
that an implicit commitment to such a view can lead to 
the impression that the assumption of preparation non- 
contextuality is unnatural. 

Consider the four preparation procedures $P_a$, $P_A$, $P_b$ and $P_B$ from section IV, represented in quantum theory by the Hilbert space vectors $\psi_a$, $\psi_A$, $\psi_b$ and $\psi_B$ respectively. An ontic view of pure quantum states implies that not only are the orthogonal states associated with disjoint distributions,

\begin{align}
\mu_a(\lambda)\,\mu_A(\lambda) &= 0, \tag{69} \\
\mu_b(\lambda)\,\mu_B(\lambda) &= 0, \tag{70}
\end{align}

but also nonorthogonal states are associated with disjoint distributions,

\begin{align}
\mu_a(\lambda)\,\mu_b(\lambda) &= 0, \tag{71} \\
\mu_A(\lambda)\,\mu_b(\lambda) &= 0, \tag{72} \\
\mu_a(\lambda)\,\mu_B(\lambda) &= 0, \tag{73} \\
\mu_A(\lambda)\,\mu_B(\lambda) &= 0. \tag{74}
\end{align}



It is then clear that the preparation procedures $P_{aA}$ and $P_{bB}$, obtained respectively by implementing $P_a$ and $P_A$ with equal probability, or $P_b$ and $P_B$ with equal probability, are represented by distributions $\mu_{aA}$ and $\mu_{bB}$ (defined in Eqs. (27) and (28)) that are also disjoint,

$$\mu_{aA}(\lambda)\,\mu_{bB}(\lambda) = 0. \tag{76}$$



However, since these two procedures are represented by 
the same density operator, namely $I/2$, they must be
represented by the same distribution in a preparation 
noncontextual model. Thus, an ontic view of quantum 



states trivially precludes the possibility of preparation 
noncontextuality. 
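
The argument can be made tangible with the simplest ontic-view model one might write down. The sketch below is purely illustrative and rests on an assumption of our own: each of the four pure states is represented by a point mass on its own ontic state (labelled `lambda_a`, `lambda_A`, `lambda_b`, `lambda_B`), which is one way, among many, of making all four distributions pairwise disjoint.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Hypothetical ontic-view model: every pure state is a point mass on its own
# ontic state, so the four distributions are pairwise disjoint.
MU = {
    "a": {"lambda_a": Fraction(1)},
    "A": {"lambda_A": Fraction(1)},
    "b": {"lambda_b": Fraction(1)},
    "B": {"lambda_B": Fraction(1)},
}


def equal_mixture(first, second):
    """Distribution of the procedure that runs either preparation with prob. 1/2."""
    mixed = {}
    for dist in (first, second):
        for state, weight in dist.items():
            mixed[state] = mixed.get(state, 0) + HALF * weight
    return mixed


def disjoint(first, second):
    """True if no ontic state has nonzero weight under both distributions."""
    shared = set(first) & set(second)
    return all(first[s] * second[s] == 0 for s in shared)


mu_aA = equal_mixture(MU["a"], MU["A"])
mu_bB = equal_mixture(MU["b"], MU["B"])
```

In this toy model `mu_aA` and `mu_bB` are disjoint and hence unequal, even though both procedures correspond to the same density operator, so the model is preparation contextual by construction.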

Since our manner of speaking about pure quantum 
states typically favors the ontic view of the latter, it 
also tends to make the assumption of preparation non- 
contextuality seem implausible. The very term "quan- 
tum state" already predisposes one to thinking of the 
density operator as representing the physical state of af- 
fairs rather than an agent's knowledge. For instance, in 
the context of photon polarization, the multiplicity of 
convex decompositions of the completely mixed state is 
sometimes summarized as follows: "an equal mixture of 
states of horizontal and vertical polarization is statisti- 
cally indistinguishable from an equal mixture of states 
of left and right circular polarization". Implicit in this
sort of language is the assumption that the four different 
states of polarization are mutually exclusive states of af- 
fairs and are therefore ontic states. Indeed, this way of 
putting things compels us to question (in vain) whether 
there isn't really some measurement that could tell these 
two cases apart. However, it is wrong to take this as an 
argument against the "naturalness" of preparation non- 
contextuality because this impression can be attributed 
entirely to the language that is used to describe the phe- 
nomenon. 

If one is to take the epistemic view seriously, as one should in an investigation of the possibility of an ontological model of quantum theory, then this sort of language must be avoided, and the assumption of preparation noncontextuality is a priori very plausible. Indeed, in light of the arguments that have recently been made in favor of the epistemic view of quantum states [22, 23, 24, 25] and the fact that one can reproduce qualitatively many quantum phenomena in noncontextual theories [25, 26, 27, 28], the impossibility of a preparation noncontextual ontological model appears all the more shocking to the devoted realist.



VIII. THE ISSUE OF OUTCOME 
DETERMINISM 

In our proof of contextuality for unsharp measure- 
ments, we assumed outcome determinism for sharp mea- 
surements but we assumed outcome mdeterminism for 
unsharp measurements. This amounts to representing 
all and only those POVMs with idempotent elements by 
sets of indicator functions that are idempotent. Although 
this seems like a natural assumption to make, two alter- 
native assumptions might seem a priori worth consid- 
ering: (1) that both sharp and unsharp measurements 
are outcome-deterministic, or (2) that both are
outcome-indeterministic.

We begin by considering the first alternative, that out- 
come determinism also holds for unsharp measurements. 
It turns out that this is trivially inconsistent with
assuming measurement noncontextuality. Consider a
measurement procedure M associated with the POVM {I/2, I/2}.
As argued earlier, the assumption of measurement
noncontextuality implies that M must be represented in
an ontological model by the set of indicator functions
{1/2, 1/2}, which are not idempotent, and thus M cannot
be outcome-deterministic. A recent result by Cabello [4]
also rules out the possibility of a hidden variable
model that is measurement noncontextual and
outcome-deterministic for unsharp measurements. However, this
proof is unnecessarily complex since a consideration of
the POVM {I/2, I/2} yields the result immediately.

The second alternative is that both sharp and unsharp 
measurements are outcome-indeterministic. This is the
more significant alternative, because it constitutes the 
weakest assumption and consequently the most general 
framework for an ontological model. Indeed, unless the 
assumption of outcome determinism can itself be justi- 
fied by the assumption of noncontextuality, it is inappro- 
priate to call any no-go theorem that makes use of this 
assumption a proof of contextuality, because in the face
of a contradiction one can always assume that the faulty 
assumption was that of outcome determinism rather than 
that of measurement noncontextuality. Thus, neither the
proof of Bell [1], nor the proof of Kochen and Specker [2],
nor any of the proofs of these types, including those
presented in section V, serves to rule out the possibility
of measurement noncontextuality (in the sense in which
we have defined the term). It turns out, however, that
the assumption of outcome determinism for sharp mea- 
surements can be justified by an assumption of prepara- 
tion noncontextuality, as we shall presently demonstrate. 
Given this inference, the old proofs are vindicated inso- 
far as they remain proofs of the impossibility of universal 
noncontextuality (noncontextuality for all experimental 
procedures). 

It should be noted that Toner, Bacon, and Ben-Or [16]
have considered a third alternative, namely, that outcome 
determinism holds for just those POVMs with elements 
that are not repeatable, that is, elements that cannot 
appear twice in a single POVM, and have obtained a 
nontrivial no-go theorem. Bacciagaluppi [17] has considered
a similar alternative and obtained a similar result.
Although this is a much weaker assumption than the first 
alternative, the resulting theorems are still not proofs of 
the impossibility of universal noncontextuality, according 
to our definition, since the assumption of outcome deter- 
minism for these special POVMs has not been justified 
by an assumption of universal noncontextuality. In con- 
trast, outcome determinism for all sharp measurements 
can be so justified. We turn now to the proof of this 
statement. 



A. Preparation noncontextuality implies outcome 
determinism for sharp measurements 

Consider a rank-1 PVM $\{P_k\}$. Thinking of each of the
elements as a rank-1 density operator, $\rho_k = P_k$, we
obtain an orthogonal set of rank-1 density operators $\{\rho_k\}$.
We denote the density operators and projectors differently
because they are represented differently in the
ontological model. The set $\{\rho_k\}$ is represented by a set
of probability densities $\{\mu_k(\lambda)\}$, while the PVM $\{P_k\}$ is
represented by a set of indicator functions $\{\xi_k(\lambda)\}$. Since
the $\rho_k$ are orthogonal, the associated preparations are
distinguishable with certainty, and thus by feature 1 of
ontological models we must have

$$\mu_k(\lambda)\,\mu_{k'}(\lambda) = 0 \quad \text{for } k \neq k'. \tag{77}$$



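The support of $\mu_k(\lambda)$, denoted $\Omega_k$, is the region of the
ontic state space assigned non-zero probability by $\mu_k(\lambda)$,

$$\Omega_k = \{\lambda \mid \mu_k(\lambda) > 0\}. \tag{78}$$

Eq. (77) then implies that

$$\Omega_k \cap \Omega_{k'} = \emptyset \quad \text{if } k \neq k'. \tag{79}$$

Now, by virtue of the fact that

$$\mathrm{Tr}(\rho_k P_{k'}) = \delta_{k,k'}, \tag{80}$$

we infer that

$$\int d\lambda\, \mu_k(\lambda)\,\xi_{k'}(\lambda) = \delta_{k,k'}. \tag{81}$$

But, given Eq. (78), this implies that

$$\xi_k(\lambda) = \begin{cases} 1 & \text{for } \lambda \in \Omega_k, \\ 0 & \text{for } \lambda \in \cup_{k' \neq k}\,\Omega_{k'}, \end{cases} \tag{82}$$

or, equivalently,

$$\xi_k(\lambda)\,\xi_{k'}(\lambda) = \delta_{k,k'}\,\xi_k(\lambda) \quad \text{for } \lambda \in \cup_k \Omega_k. \tag{83}$$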



So, if one can show that the union of the supports of the
$\mu_k(\lambda)$ is the entire ontic state space, i.e.,

$$\cup_k \Omega_k = \Omega, \tag{84}$$

then Eq. (83) would imply that $\{\xi_k(\lambda)\}$ is a set of
idempotent indicator functions, and consequently would
establish that our rank-1 PVM must be outcome-deterministic
in the ontological model.
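
The passage from the integral condition on the indicator functions to their pointwise values can be spelled out in one step. The following is a sketch, assuming (as is standard for ontological models, though not restated here) that each indicator function takes values in $[0,1]$ and that each $\mu_k$ is normalized; the phrase "almost all" is our own qualification.

```latex
% Assumptions: 0 \le \xi_{k'}(\lambda) \le 1 and \int d\lambda\, \mu_k(\lambda) = 1.
\begin{align*}
\int d\lambda\, \mu_k(\lambda)\,\xi_k(\lambda) = 1
  &\;\Rightarrow\; \int d\lambda\, \mu_k(\lambda)\,[1 - \xi_k(\lambda)] = 0
  \;\Rightarrow\; \xi_k(\lambda) = 1 \ \text{for almost all } \lambda \in \Omega_k, \\
\int d\lambda\, \mu_k(\lambda)\,\xi_{k'}(\lambda) = 0 \ \ (k' \neq k)
  &\;\Rightarrow\; \xi_{k'}(\lambda) = 0 \ \text{for almost all } \lambda \in \Omega_k .
\end{align*}
```

In both lines the integrand is non-negative and $\mu_k(\lambda) > 0$ on $\Omega_k$, so a vanishing integral forces the other factor to vanish there.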

It turns out that Eq. (84) follows from the assumption
of preparation noncontextuality. First note that the ontic
state space $\Omega$ can be defined as the set of $\lambda$ that are
assigned non-zero probability by some density operator,

$$\Omega = \{\lambda \mid \mu_\rho(\lambda) > 0 \text{ for some } \rho\}. \tag{85}$$



However, since every density operator $\rho$ appears in some
convex decomposition of the completely mixed state $I/d$
(where $d$ is the dimensionality of the Hilbert space), and
since preparation noncontextuality implies that there is
a unique distribution $\mu_{I/d}(\lambda)$ associated with this state,
it follows that $\Omega$ is simply the set of $\lambda$ assigned non-zero
probability by the latter, i.e.,

$$\Omega = \{\lambda \mid \mu_{I/d}(\lambda) > 0\}. \tag{86}$$

But given that the $\rho_k$ form a convex decomposition of
$I/d$,

$$\sum_k \frac{1}{d}\,\rho_k = \frac{I}{d}, \tag{87}$$

it follows from preparation noncontextuality that

$$\mu_{I/d}(\lambda) = \sum_k \frac{1}{d}\,\mu_k(\lambda), \tag{88}$$

which implies Eq. (84).

This establishes outcome determinism for PVMs all of 
whose elements are rank 1. Since an arbitrary PVM can 
always be obtained by coarse-graining of a rank-1 PVM, 
and since coarse-graining takes idempotent functions to 
idempotent functions, any PVM is represented by a set of 
idempotent indicator functions. This establishes that the 
assumption of outcome determinism for sharp measure- 
ments follows from an assumption of preparation non- 
contextuality. 
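
The coarse-graining step can be illustrated on a finite toy model. The sketch below is not taken from the paper: the five-point ontic state space, the particular 0/1 functions, and the helper names `is_idempotent` and `coarse_grain` are hypothetical choices made only for illustration. It shows that summing idempotent indicator functions that add up to one at every ontic state again gives idempotent functions.

```python
def is_idempotent(xi, tol=1e-12):
    """True if xi(lambda)**2 == xi(lambda) at every ontic state, i.e. xi is 0 or 1."""
    return all(abs(v * v - v) < tol for v in xi)


def coarse_grain(functions, groups):
    """Sum the fine-grained indicator functions within each group of outcomes."""
    n = len(functions[0])
    return [[sum(functions[k][i] for k in group) for i in range(n)]
            for group in groups]


# Hypothetical toy model: 5 ontic states, 3 fine-grained (rank-1) outcomes.
# At each ontic state exactly one outcome has value 1, so the functions sum to 1.
fine = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
]

# Merge outcomes 0 and 1 into a single coarse-grained outcome.
coarse = coarse_grain(fine, groups=[[0, 1], [2]])

print(all(is_idempotent(xi) for xi in coarse))
print(is_idempotent([0.5, 0.5, 0.5, 0.5, 0.5]))  # a constant-1/2 function is not idempotent
```

The constant function in the last line is the kind of indicator function that a noncontextual representation of the POVM {I/2, I/2} would require.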

It is natural to wonder whether outcome determinism 
for sharp measurements might be justified by an assump- 
tion of measurement noncontextuality (rather than an as- 
sumption of preparation noncontextuality). If this were 
possible, then the proofs in section V would derive
contradictions from measurement noncontextuality alone. It
turns out that this is not possible, because measurement 
noncontextuality on its own is consistent with quantum 
theory, as we now show. 



B. Achieving measurement noncontextuality by 
giving up outcome determinism 

Consider the following ontological model of quantum
theory, which is objectively indeterministic and adopts
an ontic view of quantum states. The ontic state space
$\Omega$ is simply taken to be the projective Hilbert space, that
is, the set of rays of Hilbert space. Thus, with every rank-1
projector $|\psi\rangle\langle\psi|$ we associate a single ontic state, which
we denote by $\psi$. Consequently, there are no hidden
variables in this ontological model. A preparation procedure
associated with the rank-1 density operator $|\psi'\rangle\langle\psi'|$ is
represented by a Dirac-delta distribution

$$\mu_{\psi'}(\psi) = \delta(\psi - \psi'). \tag{89}$$



A preparation procedure involving a convex combination
of rank-1 density operators $\{p(\psi'), |\psi'\rangle\langle\psi'|\}$ is
represented by the distribution

$$\mu(\psi) = \int d\psi'\, p(\psi')\,\delta(\psi - \psi'), \tag{90}$$



where $d\psi$ is the unitarily-invariant measure on the
projective Hilbert space. A measurement of the POVM $\{Q_k\}$
is associated with a set of indicator functions $\{\xi_{Q_k}(\lambda)\}$
defined by

$$\xi_{Q_k}(\psi) = \mathrm{Tr}(Q_k |\psi\rangle\langle\psi|). \tag{91}$$



These functions are clearly positive by virtue of the
positivity of the $Q_k$, and sum to unity by virtue of the fact
that $\sum_k Q_k = I$. Note also that they depend only on the
POVM that is associated with the measurement and not
on how it was implemented. One can see that this model
reproduces quantum theory by noting that

$$\int d\psi\, \mu_{\psi'}(\psi)\,\xi_{Q_k}(\psi) = \mathrm{Tr}(Q_k |\psi'\rangle\langle\psi'|). \tag{92}$$

The predictions for mixed preparations are also reproduced.
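
The indicator functions of this model are easy to evaluate for a qubit. The sketch below is illustrative only: the Bloch-angle parametrization in `ket`, the helper name `response`, and the chosen state are our own assumptions. It computes Tr(Q |psi><psi|) for a sharp measurement in the computational basis and for the unsharp POVM {I/2, I/2}.

```python
import cmath
import math


def ket(theta, phi):
    """Qubit state vector at Bloch angles (theta, phi); an assumed parametrization."""
    return [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]


def response(Q, psi):
    """Indicator function value Tr(Q |psi><psi|) = <psi|Q|psi> for a 2x2 matrix Q."""
    total = 0j
    for i in range(2):
        for j in range(2):
            total += psi[i].conjugate() * Q[i][j] * psi[j]
    return total.real


P0 = [[1, 0], [0, 0]]        # projector onto |0>
P1 = [[0, 0], [0, 1]]        # projector onto |1>
HALF = [[0.5, 0], [0, 0.5]]  # the POVM element I/2

psi = ket(math.pi / 3, 0.0)
sharp = [response(P0, psi), response(P1, psi)]
unsharp = [response(HALF, psi), response(HALF, psi)]

print(sharp, sum(sharp))      # values in [0, 1] that sum to 1
print(unsharp, sum(unsharp))  # the constant functions 1/2 and 1/2
```

For this state the sharp measurement yields values strictly between 0 and 1, which is the sense in which the model is outcome-indeterministic even for projective measurements.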

This model has been discussed at length by Beltrametti
and Bugajski [8], and captures to some extent the
ontological model that many physicists implicitly adhere
to. Note that the model is obviously preparation
contextual since the distribution that represents a convex
combination of preparation procedures, described in Eq.
(90), depends on the particular ensemble of pure states,
and not just on the density operator associated with the
mixture. This fact comes as no surprise since the results
of section IV show that any ontological model,
deterministic or not, must be preparation contextual. More
importantly for the purposes of this section, the set of
indicator functions associated with any nontrivial PVM $\{P_k\}$
is not idempotent. This is clear since $\mathrm{Tr}(P_k |\psi\rangle\langle\psi|)$ is only
0 or 1 if $|\psi\rangle$ lies in an eigenspace of $P_k$. It follows that
the assumption of outcome determinism for sharp
measurements is explicitly violated. However, because the
set of indicator functions depends only on the POVM,
and not on its context, the assumption of measurement
noncontextuality is upheld.
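
The preparation contextuality of this model can be checked directly for the completely mixed qubit state. The sketch below is an illustration under our own assumptions: ensembles are represented as lists of (probability, state vector) pairs, and the helper names `density`, `weight_on_ray`, and `same_matrix` are hypothetical. Two ensembles with the same density operator are shown to place their probability on different rays.

```python
import math


def density(ensemble):
    """Density matrix sum_i p_i |psi_i><psi_i| of a list of (p_i, psi_i) pairs."""
    rho = [[0j, 0j], [0j, 0j]]
    for p, psi in ensemble:
        for i in range(2):
            for j in range(2):
                rho[i][j] += p * psi[i] * complex(psi[j]).conjugate()
    return rho


def weight_on_ray(ensemble, phi, tol=1e-9):
    """Total probability the ensemble assigns to the ray (ontic state) containing phi."""
    total = 0.0
    for p, psi in ensemble:
        overlap = abs(sum(complex(a).conjugate() * b for a, b in zip(psi, phi))) ** 2
        if abs(overlap - 1.0) < tol:
            total += p
    return total


def same_matrix(A, B, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))


s = 1 / math.sqrt(2)
z_ensemble = [(0.5, [1, 0]), (0.5, [0, 1])]   # equal mixture of |0> and |1>
x_ensemble = [(0.5, [s, s]), (0.5, [s, -s])]  # equal mixture of |+> and |->

same_rho = same_matrix(density(z_ensemble), density(x_ensemble))
mu_z = weight_on_ray(z_ensemble, [1, 0])
mu_x = weight_on_ray(x_ensemble, [1, 0])

print(same_rho)    # both ensembles give the density operator I/2
print(mu_z, mu_x)  # but they weight the ontic state |0> differently
```

Since the two distributions differ while the density operators agree, the representation depends on the context of the preparation, as the text states.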



IX. CONCLUSIONS 

Because the traditional notion of noncontextuality 
only allowed for a no-go theorem in Hilbert spaces of 
dimensionality greater than two, there have been many 
proposed hidden variable models for 2d Hilbert spaces 
that are purported to be noncontextual [1], [2]. These
have been presented primarily as pedagogical examples
of what sort of model is excluded for larger-dimensional
Hilbert spaces. However, by our generalized definition
of noncontextuality, all of these models are deemed con- 
textual by virtue of being contextual for preparations, 
transformations, and unsharp measurements. This over- 
turns the notion, suggested by the restriction of old Bell- 
Kochen-Specker theorems to Hilbert spaces of dimension- 
ality greater than two, that there is nothing inherently 
nonclassical about a 2d Hilbert space [29].

In the face of this claim, a sceptic might argue that 
the proofs presented here have made use of mixed prepa- 
rations, unsharp measurements, and irreversible trans- 
formations (associated respectively with non-rank-1 den- 
sity operators, non-projective POVMs, and non-unitary 
CP maps), and that these are necessarily implemented 
in practice through pure preparations, sharp measurements,
and reversible transformations (associated respectively
with rank-1 density operators, PVMs, and unitary
maps) on a larger system and therefore implicitly make 
use of a Hilbert space of dimension greater than two. 
However, this is incorrect. If one examines carefully the 
proofs presented in this article, one finds that wherever 
non-rank-1 density operators, non-projective POVMs, or 
non-unitary CP maps arise, they are due to ignorance of
which of several rank-1 density operators, PVMs, or uni- 
tary maps in the 2d Hilbert space is appropriate, rather 
than being due to the neglect of a subspace or subsystem 
of a larger dimensional Hilbert space. In other words, 
any "ancillary" systems used to implement such proce- 
dures can be treated classically, and thus do not require 
one to posit a larger Hilbert space. 

Our operational definition of noncontextuality has allowed
us to distinguish the notions of preparation, transformation,
and measurement noncontextuality. Our
proof of preparation contextuality is particularly novel as 
a no-go theorem insofar as it focuses on the impossibility
of reproducing, within a particular kind of ontological
model, the convex structure of the set of quantum states 
rather than the algebraic structure of the set of quantum 
measurements. It is interesting to note that wherever 
one finds a freedom of decomposition in the formalism 
of operational quantum theory, such as the multiplicity 
of convex decompositions of a mixed quantum state or 
of a POVM element, the multiplicity of fine-grainings
of a non-rank-1 POVM, or the unitary freedom in the 
operator-sum representation of a non-unitary CP map, 
one can develop a proof of contextuality that is based on 
this freedom. 

We have shown that one can confine all the contextual- 
ity into the preparations and transformations if one likes, 
because there exist outcome-indeterministic ontological 
models of quantum theory, such as the Beltrametti-Bugajski
model, that are measurement noncontextual.
On the other hand, one cannot confine all the contex- 
tuality into the measurements, because the assumption 
of preparation noncontextuality yields a contradiction on 
its own. In this sense, preparation contextuality is more 
fundamental to quantum theory than measurement con- 
textuality. 

The issue of noncontextuality is closely linked with the 
issue of locality. Indeed, it is sometimes claimed that 
nonlocality is an instance of measurement contextuality. 
If this were the case, then proofs of nonlocality would 
also constitute proofs of measurement contextuality, and 
since there exist proofs of nonlocality that do not assume 
outcome determinism for sharp measurements, it would 
appear that there should exist proofs of measurement 
contextuality that do not make this assumption either. 
But this would be in contradiction with the claims of the 
previous section. 

The resolution of this puzzle is that one can distinguish
two sorts of locality [30], and it is only the failure
of one of these that implies measurement contextuality. 
The first notion of locality, which we call separability, 
is the assumption that the ontic state of the universe
is defined in terms of the ontic states at each point of
space-time. The other sort of locality assumption, which
presumes separability, we call local causality. It is the
assumption that the probability distribution over values
for a variable in a space-time region is determined by
the values of all the variables in the backward light-cone
of this region (see footnote in section III). A failure of
local causality within the framework of a separable model
does indeed imply measurement contextuality. However, 
a model can be nonlocal by virtue of failing to be separa- 
ble, and in this case it does not follow that the model is 
measurement contextual. This is precisely what occurs 
in the Beltrametti-Bugajski model. The variables for a 
composite system are not simply the Cartesian product 
of the variables of the components, since the Cartesian 
product of two projective Hilbert spaces is not the projec- 
tive Hilbert space of the tensor product (it fails to include 
the entangled states). In particular, spatially separated 
systems are not associated with distinct variables. Thus, 
the Beltrametti-Bugajski model is not separable. It is 
only within the context of a separable theory that Bell's 
theorem implies measurement contextuality. 

The opposite inference has also received a great deal of 
attention: whether a proof of measurement contextuality 
can be turned into a proof of nonlocality (note that the 
question is only interesting if one presumes separability 
since otherwise one is already acknowledging a failure of 
some sort of locality). The motivation for this investigation
is clear since, as Bell famously emphasized, an
assumption of measurement noncontextuality is most
compelling if it can be justified by an assumption of locality
[31]. Many authors have shown how certain no-go theorems
for measurement noncontextuality can be turned
into no-go theorems for locality by virtue of the fact that
sometimes every assumption of measurement noncontextuality
in a Bell-Kochen-Specker theorem can be justified
by an assumption of locality [21, 32]. It turns out that the
same trick can be achieved in no-go theorems for
preparation noncontextuality. Although the particular proof
of the no-go theorem presented in section IV does not
admit such a justification, a proof can be found which
does. This will be presented in a separate article [33].
The version of Bell's theorem that results is particularly
enlightening, as it constitutes a more direct response to
the EPR argument [34] compared to standard versions of
the theorem.

It should be noted that there are contexts that do not 
have any representation in the formalism of operational 
quantum theory. Whether one uses a piece of polaroid or 
a birefringent crystal in a measurement of photon polar- 
ization is an example of such a context. No dependence 
on this sort of context is implied by any of the no-go 
theorems we have presented (footnote 14). Nonetheless, some
hidden variable theories still exhibit such dependence. For
instance, it has been shown that the deBroglie-Bohm
interpretation has this sort of context-dependence for
certain position measurements [35, 36] and for certain
spin measurements [37]. Thus, the deBroglie-Bohm
interpretation involves more contextuality than has been
shown to be required of an ontological model. Note that
Refs. [35, 36, 37] explicitly identify this feature of the
deBroglie-Bohm interpretation as a kind of contextuality
despite the fact that it does not fit into the standard
definition of contextuality presented in the introduction.
The possibility of this type of phenomenon was in fact
considered in the framework of a general hidden variable
theory much earlier by Shimony [38], who also described
it as a kind of contextuality. This highlights another
virtue of our generalized definition of contextuality: it
accords with the intuition that a measurement context is
any feature of the measurement that is not specified by
specifying its equivalence class.

14 If, however, one decides to treat part of the experimental
apparatus as a quantum system, then this sort of distinction could

An operational definition of noncontextuality is also 
likely to be useful because it allows one to investigate 
the possibility of finding ways of experimentally
differentiating the set of noncontextual theories from the set
of contextual theories, much as the Bell inequalities dif- 
ferentiate all local realistic theories from their alterna- 
tives. If these investigations are successful, they could 
shed light on the question of how to perform experimental
tests of contextuality, a subject of much recent interest
[39, 40, 41]. The question of whether an experimental
test of contextuality is even possible has been the subject
of some controversy, due to the finite precision of real
experimental procedures [42, 43, 44, 45]. The problem,
from the perspective of this article, is that finite precision 
might imply that in practice no two experimental proce- 
dures are found to be operationally equivalent, in which 
case the assumption of noncontextuality is never applica- 
ble. A possible resolution of this finite precision loophole 
is to further generalize the definition of noncontextuality 
proposed in the introduction as follows: 



A noncontextual ontological model of an op- 
erational theory is one wherein if two exper- 
imental procedures are operationally similar, 
then they have similar representations in the 
ontological model. 

To be substantive, this proposal must be supplemented 
by a quantitative measure of similarity in the space of 
operational procedures, and a corresponding measure in 
the space of ontological representations of these proce- 
dures. Whether this strategy can lead to an experimen- 
tally robust notion of contextuality is a subject for future 
research. 

Finally, given the fact that some quantum information 
processing protocols, namely, protocols for communication
complexity problems [46], have been proven to require
violations of Bell inequalities in order to outperform
their classical counterparts, it is interesting to investigate 
whether the power of any quantum information process- 
ing protocols might be attributed to the contextuality of 
quantum theory. There is already some evidence to this 
effect in the case of random access codes [47]. We speculate
that this might also be the case for the exponential
speed-up of a quantum computer relative to a classical 
computer, if such a speed-up exists. 

APPENDIX A: PROOF OF EQS. (56)-(60)

To demonstrate Eqs. (56)-(60), we make use of the
fact that there is unitary freedom in the operator-sum
representation of a CP map [48]. Suppose that $\{W_\mu\}$ is
a set of operators (called Kraus operators) appearing in
an operator-sum representation of $\mathcal{T}$, that is,

$$\mathcal{T}(\rho) = \sum_\mu W_\mu\, \rho\, W_\mu^\dagger. \tag{A1}$$



Then, for any unitary matrix $u_{\nu\mu}$, the set of operators
$\{X_\nu\}$ defined by

$$X_\nu = \sum_\mu u_{\nu\mu} W_\mu \tag{A2}$$

also forms an operator-sum representation of $\mathcal{T}$. Note
that we allow Kraus operators to be zero, so that
different operator-sum representations may have different
cardinality.
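
The unitary freedom described above can be verified numerically on a small example. The sketch below is illustrative and not part of the original proof: the helper names (`matmul`, `dagger`, `apply_channel`, `remix`, `close`), the dephasing example, and the choice of remixing matrix are our own assumptions. It checks that two Kraus sets related by a unitary remixing act identically on a test state.

```python
import math


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]


def apply_channel(kraus, rho):
    """Operator-sum action: sum over mu of W_mu rho W_mu^dagger."""
    n = len(rho)
    out = [[0j] * n for _ in range(n)]
    for W in kraus:
        term = matmul(matmul(W, rho), dagger(W))
        for i in range(n):
            for j in range(n):
                out[i][j] += term[i][j]
    return out


def remix(kraus, u):
    """Unitary remixing: X_nu = sum over mu of u[nu][mu] * W_mu."""
    n = len(kraus[0])
    return [[[sum(u[nu][mu] * kraus[mu][i][j] for mu in range(len(kraus)))
              for j in range(n)] for i in range(n)] for nu in range(len(u))]


def close(A, B, tol=1e-12):
    """Entrywise comparison of two square matrices."""
    n = len(A)
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(n) for j in range(n))


s = 1 / math.sqrt(2)
W = [[[s, 0], [0, s]], [[s, 0], [0, -s]]]  # hypothetical example: I/sqrt(2) and sigma_z/sqrt(2)
u = [[s, s], [s, -s]]                      # a real 2x2 unitary (Hadamard) used for remixing
X = remix(W, u)                            # approximately the projectors onto |0> and |1>

rho = [[0.5, 0.5], [0.5, 0.5]]             # test state |+><+|
print(close(apply_channel(W, rho), apply_channel(X, rho)))
```

The two Kraus sets here describe the same map, once as a mixture of unitaries and once as a projective measurement whose outcome is discarded, which is the kind of context that transformation noncontextuality ignores.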

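Eq. (56) implies that $\mathcal{T}$ has an operator-sum
representation in terms of the set of Kraus operators
$\{W_1, W_2\} = \{\frac{1}{\sqrt{2}} U_0, \frac{1}{\sqrt{2}} U_\pi\}$, since

$$\mathcal{T}(\rho) = \frac{1}{2}\, U_0\, \rho\, U_0^\dagger + \frac{1}{2}\, U_\pi\, \rho\, U_\pi^\dagger. \tag{A3}$$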



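The set of operators $\{X_1, X_2\} = \{\frac{1}{\sqrt{2}} U_\theta, \frac{1}{\sqrt{2}} U_{\theta+\pi}\}$ also
yields an operator-sum representation of $\mathcal{T}$ since they
can be obtained by a unitary remixing of $\{W_1, W_2\}$, via
Eq. (A2), using the $2 \times 2$ unitary matrix

$$\begin{pmatrix} \cos\frac{\theta}{2} & \sin\frac{\theta}{2} \\ -\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}. \tag{A4}$$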

It follows, in particular, that the sets of Kraus operators
$\{\frac{1}{\sqrt{2}} U_{\pi/3}, \frac{1}{\sqrt{2}} U_{4\pi/3}\}$ and $\{\frac{1}{\sqrt{2}} U_{2\pi/3}, \frac{1}{\sqrt{2}} U_{5\pi/3}\}$ form
operator-sum representations of $\mathcal{T}$, and consequently
that Eqs. (57) and (58) hold.

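Next, we show that the set of operators $\{X_1, X_2, X_3\} =
\{\frac{1}{\sqrt{3}} U_\theta, \frac{1}{\sqrt{3}} U_{\theta+2\pi/3}, \frac{1}{\sqrt{3}} U_{\theta+4\pi/3}\}$ also yields an
operator-sum representation of $\mathcal{T}$. First note that the set
$\{W_1, W_2, W_3\} = \{\frac{1}{\sqrt{2}} U_0, \frac{1}{\sqrt{2}} U_\pi, 0\}$ yields the
operator-sum representation of $\mathcal{T}$ associated with Eq. (56). The
operators $\{X_1, X_2, X_3\}$ can be obtained by a unitary
remixing of $\{W_1, W_2, W_3\}$ using the $3 \times 3$ unitary matrix

$$\begin{pmatrix}
\sqrt{\frac{2}{3}} \cos\frac{\theta}{2} & \sqrt{\frac{2}{3}} \sin\frac{\theta}{2} & \frac{1}{\sqrt{3}} \\
\sqrt{\frac{2}{3}} \cos\left[\frac{1}{2}\left(\theta + \frac{2\pi}{3}\right)\right] & \sqrt{\frac{2}{3}} \sin\left[\frac{1}{2}\left(\theta + \frac{2\pi}{3}\right)\right] & -\frac{1}{\sqrt{3}} \\
\sqrt{\frac{2}{3}} \cos\left[\frac{1}{2}\left(\theta + \frac{4\pi}{3}\right)\right] & \sqrt{\frac{2}{3}} \sin\left[\frac{1}{2}\left(\theta + \frac{4\pi}{3}\right)\right] & \frac{1}{\sqrt{3}}
\end{pmatrix}. \tag{A5}$$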



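It follows, in particular, that $\{\frac{1}{\sqrt{3}} U_0, \frac{1}{\sqrt{3}} U_{2\pi/3}, \frac{1}{\sqrt{3}} U_{4\pi/3}\}$
and $\{\frac{1}{\sqrt{3}} U_{\pi/3}, \frac{1}{\sqrt{3}} U_\pi, \frac{1}{\sqrt{3}} U_{5\pi/3}\}$ form operator-sum
representations of $\mathcal{T}$ and consequently that Eqs. (59) and (60)
hold.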
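
The five decompositions named in this appendix can be checked numerically. The sketch below rests on an assumption that is not stated in this appendix: that $U_\theta$ is the qubit rotation $\exp(-i\theta\sigma_z/2)$ about a fixed axis. The helper names `U` and `mixture`, and the test state, are likewise hypothetical. Under that convention, each equal-weight mixture of unitaries acts identically on an arbitrary density matrix.

```python
import cmath
import math


def U(theta):
    """Assumed convention: U_theta = exp(-i * theta * sigma_z / 2), a diagonal unitary."""
    return [[cmath.exp(-1j * theta / 2), 0j], [0j, cmath.exp(1j * theta / 2)]]


def mixture(angles, rho):
    """Equal-weight mixture of the maps rho -> U_theta rho U_theta^dagger."""
    n = len(angles)
    out = [[0j, 0j], [0j, 0j]]
    for theta in angles:
        u = U(theta)
        for i in range(2):
            for j in range(2):
                # u is diagonal, so (u rho u^dagger)_ij = u_ii * rho_ij * conj(u_jj)
                out[i][j] += u[i][i] * rho[i][j] * u[j][j].conjugate() / n
    return out


def close(A, B, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))


pi = math.pi
decompositions = [
    [0, pi],
    [pi / 3, 4 * pi / 3],
    [2 * pi / 3, 5 * pi / 3],
    [0, 2 * pi / 3, 4 * pi / 3],
    [pi / 3, pi, 5 * pi / 3],
]

rho = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]  # an arbitrary test density matrix
results = [mixture(angles, rho) for angles in decompositions]
print(all(close(results[0], r) for r in results))

# Under the assumed convention, U_theta = cos(theta/2) U_0 + sin(theta/2) U_pi.
theta = 0.9
combo = [[math.cos(theta / 2) * U(0)[i][j] + math.sin(theta / 2) * U(pi)[i][j]
          for j in range(2)] for i in range(2)]
print(close(combo, U(theta)))
```

Every mixture removes the off-diagonal elements of the test state and leaves the diagonal untouched, so the five decompositions are operationally equivalent under this convention.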



